PREFACE


Prolog is a non-conventional programming language for a wide spectrum
of applications, including language processing, database modelling and
implementation, symbolic computing, expert systems, computer-aided design,
simulation, software prototyping and planning. A version of Prolog has been
chosen as a systems programming language for so-called fifth-generation
computers; experiments with systems programming and concurrent programming
are in progress.

Prolog, devised by Alain Colmerauer, is a logic programming language. 
Logic programming is a new discipline which lends a unifying view to many 
domains of computer science. Prolog can be classified as a descriptive
programming language, as opposed to prescriptive (or imperative) languages such
as Pascal, C and Ada. In principle, the programmer is only supposed to
specify what is to be done by his or her program, without bothering with 
how this should be achieved. Robert A. Kowalski has coined the “equation” 

Algorithm = Logic + Control, 

which emphasizes the distinction between the what (logic) and the how
(control). The programmer need not always specify the control component. In
practice, however, Prolog can be treated as a procedural language. 

Prolog is not standardized, and it comes in many different flavours. 
The most widespread dialect of Prolog is Prolog-10, originally implemented 
by David H. D. Warren for DEC-10 computers. We describe a variant of 
this dialect, based on an interpreter written in Pascal especially for this book. 

The main part of the book is Chapters 1—5. Chapters 1 and 3 
are an introduction to Prolog, intended for those who use prescriptive 
languages in their everyday practice. Both intuitions and the presentation 
are “practically” biased, but we assume the reader has a certain amount of 
programming experience and sophistication. Chapter 2 explains Prolog in
terms of logic. It requires no deep knowledge of mathematics and is intended
as a counterpoint to Chapter 1, but can be skipped on a first reading. Chapter 
4 contains some useful programming techniques and hints. Chapter 5 is a 
reference manual for the version of Prolog described in this book. In
addition, Chapter 8 is a discussion of two rather illuminating applications.

For those who wish to gain more insight into the language and its inner
workings, Chapter 6 introduces basic principles of Prolog implementation.
An implementation of the dialect described in this book is presented in 
Chapter 7. We used this implementation to test our examples, including the 
case studies of Chapter 8. 

Chapter 9, written by Janusz S. Bień (who also did most of the
bibliography), briefly outlines the most characteristic features of several other
Prolog dialects. 

The diskette enclosed with this book contains source text of all the 
programs listed in Chapter 8 and in the appendices, and of the Toy-Prolog 
interpreter discussed in Chapter 7. The interpreter is written in TURBO 
Pascal. You can use the diskette on any IBM PC compatible computer 
running MS-DOS 2.10 or 3.10. 

The material in this book, supplemented by some additional reading and 
a programming assignment, can be used for a two-semester course at the 
level of third-year computer science majors. Re-implementation of or
extensions to the interpreter of Chapter 7 might make interesting assignments for
a translator-writing course. 

While working on this book, we used the computing facilities of the 
Institute of Informatics, Warsaw University. We would like to thank Pawel 
Gburzynski and Krzysztof Kimbler, who helped us switch almost painlessly 
to a different machine when the one we originally used broke down for a 
protracted period of time. We thank David H. D. Warren for permitting us to 
include the listings of WARPLAN. We are also grateful to all those who have 
provided us with logic programming literature for the past 10 years. 




1 AN INTRODUCTION TO PROLOG 


Prolog is an unconventional language. In particular, its data structures are 
quite different from those found in other programming languages. As it is 
difficult to talk about a computation without understanding the sort of 
data that can be processed, we shall discuss data structures at some 
length before coming to the question of how to do anything with them. 
Have patience. 


1.1. DATA STRUCTURES 


1.1.1. Constants 

Constants are the primitive building blocks of data structures.
Constants have no structure, so they are often called “atoms.” They
represent only themselves: they can be thought of as identical with their
names.

In Basic or Fortran, 1951 is a constant. The integer variable J is not, 
because it represents both a memory cell and—in certain contexts—a 
value. The value is something quite different from the variable itself. 

One is accustomed to treating 1951 as a number greater than 1948, but 
this is because in programming languages constants usually belong to 
certain types. The usual properties of integer constants (their ordering, 
ability to be used in arithmetic operations, etc.) are taken for granted by 
virtue of their belonging to the type integer, just as in Pascal blue is a 
successor of red when one writes 

colour = ( red, blue, green ) 


Such type definitions impose a certain structure on the otherwise
undifferentiated universe of individual symbolic constants, each of which has
only one attribute: its name. 

The interpretation of a constant rests solely with the programmer. 
1951 can be the price of a computer, the weight of a truck, the time of day 
or a year of birth. One can always multiply it by 4, but this seldom makes 
sense when it represents a car’s registration number. The constant blue is 
less burdened with inadequate interpretations, but one might wish not to 
have the colours ordered. Constants are the primitives, and collecting 
them into types should only be done when necessary. 

In Prolog, as in other symbolic languages (such as Lisp) there is no 
need to declare constants or group them into types. One can use them 
freely, simply by writing down their names. 

A legal constant name is one of the following: 

—A sequence of digits, possibly prefixed by a minus sign; by convention, 
such constants are called integers (e.g. 0, -7, 1951); 

—An identifier, which may contain letters, digits and underscores but 
must begin with a lower case letter (e.g. q, aName, number_9); 

—A symbol which is a nonempty sequence of any of the following
characters:

+ — */ < = >. :?$&@#\“* 

—Any one of the characters 
, or ; or ! 

—The symbol [] (pronounced “nil”); 

—A quoted name, written according to the Pascal convention for strings: 
an arbitrary sequence of characters enclosed in apostrophes, an 
apostrophe being represented by two consecutive apostrophes (e.g.
'Can''t do this.' consists of 14 characters).

All of these constants are purely symbolic and have no inherent
interpretation. However, some primitive operations in Prolog do treat them in
a special way: 

—Arithmetic operations interpret integers as representations of integer 
values (they can also create new integers); 

—Comparison operations interpret integers as integer values, and all 
other constants as representations of the sequences of characters
forming their names (these are lexicographically ordered by the underlying
collating sequence); 

—Input/output operations interpret all symbols as sequences of
characters forming their names.

Each occurrence of a constant’s description (name) is treated as
referring to the same constant, but of course we are free to interpret each
separately.


1.1.2. Compound Objects

An important aspect of the expressive power of a programming
language is its ability to directly describe various data structures. Of the
popular and widely used languages, Pascal is the most powerful in this
respect, but it has several shortcomings. (This is not a criticism of Pascal:
our point of view does not take into account important design objectives
such as a safe type mechanism.)

Firstly, type definitions in Pascal are overspecified. It is impossible to 
program general algorithms which process stacks or trees regardless of 
the type of their elements. (Records with variants are only a rough
approximation to generic data types found in some more recent
programming languages.)

Secondly, those data structures which change their form dynamically
can only be built with pointers. One must therefore deal with the
structures at a very low level: the level of representation rather than the
conceptual level at which many other things are done in Pascal. Programs
using pointers are error-prone and hard to understand, because
operations on such data structures are encoded rather than directly expressed.

The third shortcoming has a similar effect. Ironically, Pascal types
are also underspecified, in that there is no way to directly express certain
quite natural constraints on the arguments of operations. One cannot say 
that the function POP can only be applied to a non-empty stack; one can 
only write a piece of code (hopefully correct) which checks the argument. 

It is interesting that these shortcomings are not shared by Prolog data 
types (or rather by their counterparts, since “type” is not really a Prolog 
concept). And yet Prolog data structures are very simple. Let us look at 
the details. 

Functors 

To describe a compound object, it is not enough to list its
components. The ordered pair (19, 24) can be an object of the type

rectangle = record

height, width : integer 

end

as well as an object of the type

timeofday = record 

hour, minute : integer 

end 

The complete description of a compound object must include a
definition of its structure. Structure is defined principally by describing the way
in which the object and its components are interrelated. Describing these
interrelationships often consists in simply giving a type name to an
aggregate of components (as in the example above). It is the programmer’s
responsibility to interpret this name in terms of real-world relations
between entities being modelled by the program.

In conventional programming languages, the structure of a compound 
object is usually described in a declaration associating the object with a 
type definition. The type definition lists the name of the
object-component relationship (type name) and, possibly, additional information about
the structure (types) of the components. The object is described by its 
name (or the name of a pointer). The name’s definition is textually remote 
from its occurrences. 

A different approach is taken in Prolog. Here, the type name is an 
integral part of all the occurrences of the object’s description. The
notation is very simple: a description of a compound object is the type name
followed by a parenthesized sequence of descriptions of its components, 
separated by commas. We write either 

rectangle( 19, 24 ) 
or 

timeofday( 19, 24 ). 

The notation is similar to that used for writing functions in mathematics. 
Terminology reflects this similarity. The type-name is called a functor, and the 
components are called arguments. There is more to it than superficial 
similarity of two simple syntactic conventions. One can certainly regard a type 
such as rectangle as a function mapping components into compound objects. 
From this point of view it is not surprising that we can have functors with no 
arguments: these are simply constants. Sometimes it is also useful to have
one-argument functors. For example, the integer 2 can be represented by the object
successor(successor(zero)) (the fact that 2 > 1 > 0 is evident from its
structure).

From the discussion above, it should be obvious that the important 
attributes of a functor are both its name and its arity (i.e. the number of 
arguments it takes). In Prolog, we can use both 

timeofday( 17, 13 )

and

timeofday( 1713 ) 

in the same program. Even if the intended interpretation is the same, 
these are two different objects: one has two components, and the other 
has one. There are also two different functors, both named timeofday. 
Whenever we speak of a functor in a context which gives no indication of its 
arity, the arity must be given explicitly. The usual notation is to write it after a 
slash: timeofday/2 or timeofday/1.

The lexical rules for forming functor names are the same for all
arities, but integers can only be constants. Thus

123( a, b )

is incorrect, but

'123'( a, b )

is perfectly all right. Also, [] is only a constant. 

Object Descriptions 

Descriptions of constants and compound objects are referred to as 
terms. Usually, the objects themselves are also called terms: this causes 
no confusion in practice, but in this chapter we shall try to distinguish 
between the two meanings. 

The arguments of a term are arbitrary terms. For example, one can 
write a term describing a “record”: 

customer( name( john, smith ),
          address( street( north_ave ), number( 173 ) ) ).

Of the various functors in this example, the outermost, customer/2, can 
be said to define the general structure of the term. It is called the main 
functor, or principal functor. Similarly, name/2 is the main functor of the 
first argument. 
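
The notions of main functor and arity can be examined directly in a present-day Prolog system. The following is a minimal sketch, assuming a system that provides the standard built-in procedures functor/3 and arg/3 (neither is introduced in this chapter); customer_term/1 is a hypothetical name chosen only for this illustration.

```prolog
% customer_term/1 is a hypothetical fact holding the term from the text.
customer_term( customer( name( john, smith ),
                         address( street( north_ave ), number( 173 ) ) ) ).

% Queries one might try (the binding of T is omitted from the answers):
%
%   ?- customer_term( T ), functor( T, Name, Arity ).
%   Name = customer, Arity = 2.
%
%   ?- customer_term( T ), arg( 1, T, First ), functor( First, Name, Arity ).
%   First = name( john, smith ), Name = name, Arity = 2.
%
%   ?- functor( john, Name, Arity ).
%   Name = john, Arity = 0.
```

The last query illustrates the remark made earlier that a constant can be regarded as a functor with no arguments.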

Here is another example of a common data structure. A list can be 
defined as either the empty list, or a list constructed of any object (a head) 
and a list (a tail). A list of the first three letters in the alphabet could then 
be described by the term 

cons( a, cons( b, cons( c, emptylist ) ) ).

The following term would be a description of the two-element list
constructed of the above list and a list containing the integer zero:

cons( cons( a, cons( b, cons( c, emptylist ) ) ), cons( 0, emptylist ) )

Even this small example demonstrates that nested parentheses can be
difficult to read. Prolog therefore provides syntactic sugar to hide this 
standard or canonic form of terms. Instead of writing 

successor( successor( zero ) )

one can choose to use successor as a prefix functor and write
successor successor zero.

Alternatively, successor can be made a postfix functor: 
zero successor successor. 

Functors with two arguments can be declared as infix functors, e.g. 
&( a, b) 

can be written as 
a & b. 

The term 

a & b & c 

would be ambiguous, so an infix functor is either left-associative or
right-associative (or non-associative, in which case the term is incorrect). If & is
right-associative, the term’s standard form is

&( a, &( b, c ) );

if & is left-associative, then the term stands for
&( &( a, b ), c ).

We can use parentheses to stress or override associativity. If & is
right-associative, then

a & b & c 

is equivalent to 

a & ( b & c ) 

but not to 

( a & b ) & c, 

which stands for 

&( &( a, b ), c ) 

regardless of associativity.

To make parentheses even less frequent, prefix, postfix and infix
functors are given priorities. Functors with lower priority take
precedence over those with a higher priority (a Prolog-10 convention, different
from that used in other programming languages and mathematics). For
example, if the priority of * is lower than that of +, then

3 * 4 + 5 and 5 + 3 * 4

denote

+( *( 3, 4 ), 5 ) and +( 5, *( 3, 4 ) ).

We can use parentheses to stress or override priorities, by writing 

( 3 * 4 ) + 5 or 3 * ( 4 + 5 ). 

Prefix, postfix and infix functors are usually referred to by the generic
name operators. Remember that these are not operators in any
conventional sense: they are only a syntactic convenience.
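
The rules of associativity and priority can be checked mechanically. The sketch below assumes a present-day Prolog system in which op/3 takes a priority, a type and a name, and in which * and + are predeclared with * having the lower priority. The priority 200 chosen for & and the name standard_form/2 are arbitrary assumptions made for this illustration.

```prolog
% Declare & as a right-associative infix functor (200 is an arbitrary priority).
:- op( 200, xfy, & ).

% Each fact pairs a term written with operators with its standard form.
% In every pair the two arguments are one and the same term.
standard_form( a & b & c,     &( a, &( b, c ) ) ).
standard_form( ( a & b ) & c, &( &( a, b ), c ) ).
standard_form( 3 * 4 + 5,     +( *( 3, 4 ), 5 ) ).
standard_form( 3 * ( 4 + 5 ), *( 3, +( 4, 5 ) ) ).
```

Since the operator notation is only syntactic sugar, the query `?- standard_form( X, Y ), X \== Y.` finds no pair whose two arguments differ, and therefore fails.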

Operator names may not be quoted. If an operator is to be written in 
standard form or with a different number of arguments, it must be quoted. 
If + is an infix functor, 

a + b, '+'( a, b ) and '+'( a, b, c )
are correct terms, but 

+ and +( a, b ) 
are not. 

It is also possible to declare mixed operators, i.e. functors such as the 
minus sign, which is both prefix and infix in ordinary arithmetic. Details 
about declaring prefix, postfix and infix functors can be found in Sections 
5.1 and 5.7.3. 

For the time being, we shall only use infix functors to write terms 
representing lists. However, instead of 

a cons b cons c cons emptylist 

we shall use a more concise notation, modelled after Lisp. The empty list
will be denoted by the constant [] (pronounced “nil”), and the
constructing functor by the right-associative infix functor ./2. Our two lists are
then written as

a.b.c.[] 

and 


(a.b.c.[]).0.[]

The convention is arbitrary, in that any constant and two-argument
functor would do in place of [] and the dot. It is more convenient than others,
because these are the symbols expected by several built-in procedures.

(You can write such terms after feeding Prolog with 

op( 800, xfy, '.' ).

However, a minor technical difficulty makes it impossible to use the 
period as a functor when it is immediately followed by a white space 
character, such as blank, tab or new line. This is a nuisance, and Prolog 
provides special syntactic sugar for lists: it is somewhat confusing, so we 
will put it off until Chapter 4.) 

Strings 

Characters are constants whose names consist of single characters. 
One can use quoted names for characters which are not correct identifiers 
(e.g. '*', '(', '3'; and 'x' is equivalent to x).

Strings are lists of characters. One can also write them in double 
quotes. For example 

"string" and ""
stand for

s.t.r.i.n.g.[] and []

(Actually, the convention adopted in this book is different from that 
of Prolog-10. There, a string denotes a list of ASCII codes and not a list of 
characters, so "string" stands for

115.116.114.105.110.103.[]. 

Similarly, in Prolog-10 operations for reading and writing characters deal 
directly with ASCII codes. We refuse to accept these conventions.) 
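
The two readings of a string can be compared in a present-day Prolog system. The treatment of double-quoted text varies between systems, so this sketch avoids it and assumes only the standard built-in procedures atom_chars/2 and atom_codes/2. The names string_as_chars/1 and string_as_codes/1 are hypothetical, and lists appear in the bracket notation of such systems rather than in the dot notation used here.

```prolog
% string_as_chars( -Chars ): the characters forming the name of the
% constant string, in the convention adopted in this book.
string_as_chars( Chars ) :- atom_chars( string, Chars ).

% string_as_codes( -Codes ): the same name as a list of character codes,
% in the Prolog-10 convention.
string_as_codes( Codes ) :- atom_codes( string, Codes ).

% ?- string_as_chars( Cs ).
% Cs = [s, t, r, i, n, g].
%
% ?- string_as_codes( Cs ).
% Cs = [115, 116, 114, 105, 110, 103].
```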


1.1.3. Variables 

Objects discussed so far are all, in a sense, constant. Their structure 
is fixed, we know everything about them and cannot learn anything new. 
A programming language in which one could specify only such fully
defined objects would hardly be interesting. One must be able to use objects
whose complete form is defined dynamically during a computation. 

In Prolog, the simplest such as-yet-unknown objects are called
variables (do not confuse them even for a moment with the variables of
conventional programming languages!). The term denoting a variable is
called a variable name (this is also usually called a variable: as with terms
and objects, we shall try to maintain the distinction throughout Chapter 1).
A variable name is written as an identifier starting with an upper case
letter or an underscore (e.g. Q, Number_9, _nnn).

A variable is an object whose structure is totally unknown. As a 
computation progresses, the variable may become instantiated, i.e. a 
more precise description of the object may be determined. The term 
embodying this description is called the variable’s instantiation. An
instantiated variable is identical with the object described by its
instantiation, so it ceases to be a variable, although the object can still be referred
to through the variable’s name. (In general, a variable may be instantiated 
also to another variable—we shall soon see the meaning of this.) 

There is also an alternative terminology. One says that a free (or 
unbound) variable becomes bound to another term and is henceforth
indistinguishable from that term (which is called its binding). The variable
becomes ground if its binding contains no variables. This terminology
brings to mind the process of binding formal parameters to actual
parameters. If the formal parameters were not allowed to change their value (as
in pure Lisp, say), the similarity would be very close indeed, except that a 
binding need not be ground. 

Intuitively, Prolog variables are somewhat like the variables used in 
mathematics. When we say that 

f(x) = e^x + 3x

is a function of one variable, we mean that the equation allows us to 
determine the function’s value for any (one) given argument. The variable 
denotes a single (albeit arbitrary) substitution and is not in itself an object 
to which values can be assigned. 

You can also regard a Prolog variable as an “invisible” pointer. When 
not free, the pointer is automatically dereferenced in all contexts, so it is 
impossible to distinguish it from the referenced object: in particular, it is 
impossible to exchange the object for something else. 

1.1.4. Terms 

If one thinks of a type as a set of objects, then a term is also a 
definition of a type. The term Variable3 describes the set of all objects, 
because a variable can be instantiated to anything. On the other hand, one 
can have a very precise type specification. For example, the term a.b.c.[]
describes a set containing only one object: the list of length 3, whose first 
element is a, whose second element is b and whose third element is c. 
There is a wide range of choices between these extremes.

We describe objects by defining those of their properties which we
find interesting in a given context. We do so by using variable names to
denote objects (in particular: components of other objects) whose exact
form is either unknown or unimportant. Our descriptions thus denote sets
of objects satisfying the explicitly formulated properties.

A few examples should make it clear: 

1. painting( Painter, 'Saskia' ) 

—all ’Saskia’s of an unknown artist 

2. painting( rembrandt, Picture ) 

—all pictures by Rembrandt 

3. painting( rembrandt, picture( Title, 1646 ) )

—all pictures painted by Rembrandt in 1646 

4. Head.Tail 

—all non-empty lists 

5. One.Two.Three.[] 

—all lists of three elements 

6. One.Two.13.Tail 

—all lists containing at least three elements, such that the third
element is the number 13.

Actually, our comments in examples 4 and 6 are somewhat imprecise,
as they reflect an intended interpretation. Since a variable name denotes
an arbitrary object, the type Head.Tail contains more than true lists: the
object one.two also answers this description. Similarly, the term in
example 1 describes objects such as painting(59, 'Saskia'). Term notation does
not allow us to express directly our wish to consider only paintings whose
first arguments are the names of painters. This is in keeping with the
principle that the type of a compound object is defined primarily by the
interrelationship between the object and its components, rather than by
the types of the components. The restriction is not necessarily a bad
thing: we shall see that a procedure popping an element off a stack is most
naturally written so that it can handle all stacks, whatever the types of
their elements. If one considers it important to restrict the types of
components, one can do it easily enough (we shall see how), but only
consciously and only when needed.

As a computation progresses, variables in various terms may become
instantiated. As a result, more is known about the objects described by
these terms. We extend our terminology so that we can talk about
instantiating terms and terms which are instantiations of other terms. For
example, f(X).Tail is an instantiation of Head.Tail; and it may, in due course,
be further instantiated to a yet more precise description. As we shall see,
such a multi-step approximation to a desired description is very
characteristic of Prolog.

We have proposed to regard a term as a type definition, i.e. a
description of a class of objects, or alternatively as a description of a single, as
yet undefined object. These are two sides of the same coin. A single 
object whose form is known only in general outline can be thought of as a 
representative of the class of all objects having that form. A term denotes 
a set by virtue of denoting any one of its possible instantiations. 

A comment on the role of variable names. They are used as handles 
on the objects they denote. Through a name we can, at any moment, look 
at what we have actually learned about the shape of the object. For 
example, no matter what the instantiation of Head.Tail, the variable name 
Head denotes the first element of this list. The term X.X.Tail is also quite 
legal in Prolog and denotes a list whose first and second elements are the 
same object. 

Notice that we said “the same object,” not “identical objects.” It is 
important to note that different compound objects can share components. 
In general, Prolog terms describe data structures which can be
represented as directed acyclic graphs (DAGs). If we use an arrow to denote
the relation of “being built of” (X → Y means that Y is a component of
X), then Fig. 1.1 illustrates the object denoted by

one( two( A.B ), three( A.B, C ), B ). 

Sometimes one is not interested in certain objects and needs no name 
to refer to them. Such terms can be denoted by anonymous variables, each 
of which is written as an underscore. For example 

a._._.b.[]

describes a list of length four whose first and last elements are a and b. 
The second and third elements can be any two different objects, or the 
same object: we don’t care. 
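
Terms with variables can be tried out as descriptions of sets of objects. The sketch below anticipates the procedure definitions of Section 1.2.1 and assumes a present-day Prolog system, where lists are written in bracket notation ([X, Y | Tail] in place of X.Y.Tail). All three procedure names are hypothetical.

```prolog
% two_equal_first( List ): the first and second elements are the same object.
two_equal_first( [X, X | _Tail] ).

% a_to_b( List ): exactly four elements, the first being a and the last b;
% the two anonymous variables stand for arbitrary objects.
a_to_b( [a, _, _, b] ).

% third_is_13( List ): at least three elements, the third being 13.
third_is_13( [_One, _Two, 13 | _Tail] ).

% ?- two_equal_first( [p, p, q] ).    succeeds
% ?- two_equal_first( [p, q, q] ).    fails
% ?- a_to_b( [a, x, x, b] ).          succeeds
% ?- third_is_13( [1, 13] ).          fails
```

The names _Tail, _One and _Two behave like ordinary variable names; the leading underscore merely signals that the objects they denote are of no further interest.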


1.2. OPERATIONS 

The majority of operations in a Prolog program are calls to procedures
defined by the user. Standard operations (addition, comparisons,
input/output etc.) are used relatively infrequently. For uniformity, every
operation is written as if it were a procedure call, and the principal property of
all standard operations is only that they need not (and must not) be
defined by the user. Standard operations are accordingly called built-in
procedures (or system procedures).

FIG. 1.1 Recurring components of an object.

Procedure calls are written according to the usual practice. The
procedure name is followed by an optional list of terms (actual parameters),
enclosed in parentheses and separated by commas, e.g.

show( painting(rembrandt,X), etching(rembrandt,X) ).

The “procedure name” is often called a predicate symbol (sometimes
shortened to predicate). Like functors, a predicate symbol has two
attributes: a name and an arity. Two distinct procedures can share the same
name, provided one has a different number of parameters than the other.
Procedure calls (and, as we shall see, procedure definitions) have the
same syntax as terms. This notational uniformity is useful when programs
are dynamically modified (as is normally the case during an interactive
session), but it may be confusing for the uninitiated. We shall try to help
by reserving the word “argument” for components of terms: procedures
will be said to have parameters.

Some versions of Prolog, including those described in this book,
carry this uniformity to the point of allowing the user to use prefix, postfix
and infix notation for predicate symbols (such symbols are also called
“operators”). This is achieved exactly as for functors, but a number of
frequently used symbols are usually predeclared to give the language a
more conventional flavour. A case in point is the built-in procedure is,
whose name is written in infix notation. It expects its second parameter to
be an “expression”: a term representing the abstract syntax tree of an
integer arithmetic expression; the tree is evaluated and the result is
returned through the first parameter. The two-argument functors +, -, *, /
and mod are predeclared as infix functors with conventional priorities and
associativity, so one can write

V is X - Z - Y * 2 mod ( 2 + X )

to instantiate V to an integer (provided the instantiations of X, Y and Z 
are integers or “expressions”). 
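
A worked evaluation may help here. This is a sketch assuming the priorities of present-day standard Prolog, where * and mod have the same priority and bind more tightly than + and -, all of them being left-associative; value/4 is a hypothetical procedure name, and the rule notation :- is that of standard Prolog.

```prolog
% value( +X, +Y, +Z, -V ): V is the value of X - Z - Y * 2 mod ( 2 + X ),
% which stands for -( -( X, Z ), mod( *( Y, 2 ), +( 2, X ) ) ).
value( X, Y, Z, V ) :- V is X - Z - Y * 2 mod ( 2 + X ).

% ?- value( 10, 5, 3, V ).
% V = -3.
```

With X = 10, Y = 5 and Z = 3, the subterm Y * 2 evaluates to 10 and 2 + X to 12, so the mod subterm yields 10, and the whole expression yields 10 - 3 - 10 = -3.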

Sequences of procedure calls use commas for separators, for
example:

... buy( picture(rembrandt,Title), Price ), 

NewPrice is Price * 135/100, 
sell( picture(rembrandt,Title), NewPrice ), 
drink( beer)... 

(We shall enlarge on this in Sections 1.2.2 and 1.3.1.) 

We shall need two built-in procedures for our examples: 

— nl/0 terminates an output line; 

— write/1 outputs a term (variables are written as X1, X2, etc.); for
example, if A and B are uninstantiated, then

write( f( 'an id', g( A, B ), 7, A ) ), nl

writes

f( an id, g( X1, X2 ), 7, X1 ).

More precise descriptions of system procedures can be found in 
Chapter 5. We shall now see how to define user procedures. 


1.2.1. The Simplest Form of a Procedure 

Try to think of a procedure which computes the head and the tail of a 
list: we shall call it carcdr. What should its specification be like? 

Let the list be the first parameter and let the second and third
parameters return its head and its tail. Head and tail are defined only for
non-empty lists, so the first parameter’s type is described by the term
Head.Tail (Q1.Q2 would do just as well, but it is better to use meaningful
names). This type specification is most naturally written in the procedure
heading, thus

carcdr( Head.Tail, ... ) ...

If a list is denoted by Head.Tail, then Head denotes its head and Tail 
denotes its tail. We can therefore write 

carcdr( Head.Tail, Head, Tail). 

The full stop terminates the specification. It is rather concise, but it
contains all the necessary information: the procedure is called carcdr and has
three parameters; the first parameter must be a non-empty list, the second
parameter is to become the head, and the third is to become the tail of this
list.

It turns out that what we have written is also the complete definition 
of this procedure in Prolog. The call 

carcdr( 1.2.3.[], H, T ) 

instantiates H to 1 and T to 2.3.[]. (Recall that 1.2.3.[] is really

.( 1, .( 2, .( 3, [] ) ) ) .)

Actually, our definition is somewhat more general, because—as we 
have already pointed out—Head.Tail need not be a list, as no conditions 
are imposed on the form of its tail. But this does not matter: there is no 
misunderstanding about the desired effect of, say 

carcdr( timeofday( 12, 30 ).any( Object, "at all" ), F, S ).

What we have here is a general procedure for getting at the first and 
second arguments of a term whose main functor is ./2.

We shall now specify the reverse of carcdr: a procedure which
returns, via its third parameter, a list constructed of its first and second
parameters. We shall call it cons. 

We do not really mind if the first two parameters are not lists, so there 
are no restrictions on their types: 

cons( Object, Another, ... ) ...

If Object describes an object and Another describes an object, then
applying the list constructor to the two gives us a third object, whose
description is Object.Another. This term is sufficient as a specification of the
third parameter, so we get 

cons( Object, Another, Object.Another ).

Here, again, we have a complete definition of this procedure. But
notice that the order of parameters has no meaning in itself, so we might
as well have decided to pass the constructed list through the first
parameter:

cons( Object.Another, Object, Another ).

Variable names have no inherent meaning either, so cons is really the
same as carcdr. Indeed, when we read out the specification of carcdr, we
cheated a little: distinctions such as “the parameter must be” or “the
parameter is to become” were not present in the specification.

While both carcdr and cons could be so named to reflect their in¬ 
tended use, they are both really a single procedure 

conscarcdr( Head.Tail, Head, Tail). 

This is not very surprising, as cons is the reverse of the coin of which 
carcdr is the face. 

Now the call 

conscarcdr( 1.2.3.[], H, T )

instantiates H to 1 and T to 2.3.[], and the call

conscarcdr( L, a, b.[] )

instantiates L to a.b.[]. But how is it done? We shall come to that, as soon 
as we have cleared up a point of syntax. 


1.2.2. Directives

In versions of Prolog deriving from Prolog-10, the syntax of a simple 
procedure definition such as our conscarcdr example need not necessarily 
differ from that of a procedure call. The meaning is defined by context. 

Such Prolog systems function in two modes: the command mode and 
the definition mode. Command mode is the default. 

In command mode, the system reads and executes directives. The
directives are read in from the user’s terminal or from a file. Each
directive is terminated by a fullstop (the character . immediately followed
by a white space character, including newline), and is either a query or a
command. A query is a procedure call or a sequence of procedure calls
separated by commas. Roughly, its execution consists of executing its calls
and printing the resulting variable instantiations (see the end of Section
1.2.3 for a more precise description). For example, if conscarcdr has been
defined, then after reading the query

conscarcdr( 1.2.3.[], H, T ).

the system writes

H = 1
T = 2.3.[]

(The actual printout might be in the special syntax used for lists; see 
Section 4.2.1.) 

A command has the form of a query prefixed by the symbol :- . Its
calls are executed but the variable instantiations are not written out
automatically. To get the same printout with a command, one would write

:- conscarcdr( 1.2.3.[], H, T ),
   write( 'H = ' ), write( H ), nl,
   write( 'T = ' ), write( T ), nl.

The terminology is somewhat fluid: directives are often called goal 
statements, while queries and commands are not always recognized under 
those names. (Sometimes there are also slight syntactic differences. We 
try to follow the original definition of Prolog-10, but in this book the 
standard is set by the version of Prolog described in Chapter 7.) 

Definition mode is entered upon executing the system procedure
consult/1 or reconsult/1. (The argument is the name of the file from which
procedure definitions are to be read; user is the name of the user’s
terminal. The details are in Section 5.11.) In this mode, the system
accepts procedure definitions, which are also terminated by fullstops. Our
definition of conscarcdr is an example, but see Section 1.3.1 for the
complete syntax. Commands are allowed and properly executed in this mode,
but queries are not. Definition mode is exited when the system encounters
the definition

end. 

A note about comments in Prolog. A comment starts with a % character
(not contained in a string or quoted name) and extends to the end of the
line. Be careful not to place a comment immediately after a dot that
terminates a clause: a fullstop is required, that is, the dot must be
followed by a white space character.

As a point of interest, all directives and the basic building blocks of
procedures (called clauses; we will describe them in due time) are simply
single terms. Standard operator declarations include the infix functor ,
(comma) and the prefix functor :- so the directive

:- p( 2, X ), write( X ), nl.

is really the term

':-'( ','( p( 2, X ), ','( write( X ), nl ) ) ).

This data structure is interpreted as a directive, so you need not worry
about these things unless you are an advanced Prolog hacker.

There is one important point, though probably you will find it obvious.
The actual parameters of procedure calls are the current instantiations
of terms directly written in the call. Thus

conscarcdr( a.b.[], H, T ), conscarcdr( L, H, T ), write( L ), nl.

will print out

a.b.[]


1.2.3. Unification 

Since we succeeded in packing the whole definition of conscarcdr into
its heading (the part specifying its name and formal parameters), we can
expect that its execution boils down to applying a sufficiently powerful
and general parameter-passing mechanism. This mechanism is implemented
by a term-matching operation called unification.

We will describe this operation by a pidgin-Pascal algorithm. The
function UNIFY is applied in turn to each formal and actual parameter pair.
If it returns true for all such pairs of terms, we say that unification is
successful (or succeeds); otherwise unification fails. Unification fails
when the terms describing the parameters do not match. In a very general
sense this means that the types of actual parameters are incompatible with
those of the formal parameters.

function UNIFY( var Actual, Formal : term ) : boolean;
var success : boolean;
begin
   success := true;
   if Formal is a variable then
      Formal is instantiated to Actual
   else if Actual is a variable then
      Actual is instantiated to Formal
   else if the main functors of Formal and Actual have
           different names or arities then
      success := false
   else
      while success and unmatched arguments remain do
         success := UNIFY( next argument of Actual,
                           next argument of Formal );
   UNIFY := success
end;

Notice that if we treat both the call and the procedure heading as
terms, then the process of matching successive pairs of parameters is
subsumed by the loop in UNIFY. We extend our terminology accordingly,
and say that, like a pair of terms, a call and a procedure heading
do or do not match. Alternatively, we say that they do or do not unify (are
or are not unifiable). The algorithm unifies matching terms. Unified terms
are indistinguishable, so they describe the same object.

If both Formal and Actual describe variables, then unification binds 
them together. Variables which are bound together also represent the 
same object: both their names refer to the same variable. (It is pointless to 
ask whether the formal becomes an instantiation of the actual or the other 
way round. Our algorithm implements the latter case, but this is not 
observable from the outside. You can envisage a set of bound-together 
variables as a chain of invisible pointers.) 
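
For readers who prefer running code to pidgin-Pascal, the algorithm can
be sketched in Python. This is only an illustration under our own
assumptions, not the way any Prolog system is implemented: the class Var,
the helpers walk and resolve, the dictionary of bindings and the encoding
of terms as tuples (functor first, then the arguments) are all invented
for the sketch. Constants are plain strings or integers.

```python
class Var:
    """A Prolog variable (hypothetical helper); instances differ by identity."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def walk(term, bindings):
    """Follow the chain of bindings to its end (the 'invisible pointers')."""
    while isinstance(term, Var) and term in bindings:
        term = bindings[term]
    return term


def unify(actual, formal, bindings):
    """Mirror of UNIFY; records instantiations in `bindings`.

    Note: bindings made before a failure are NOT undone here.
    """
    actual, formal = walk(actual, bindings), walk(formal, bindings)
    if actual is formal:
        return True
    if isinstance(formal, Var):          # Formal is instantiated to Actual
        bindings[formal] = actual
        return True
    if isinstance(actual, Var):          # Actual is instantiated to Formal
        bindings[actual] = formal
        return True
    if isinstance(actual, tuple) and isinstance(formal, tuple):
        if actual[0] != formal[0] or len(actual) != len(formal):
            return False                 # different names or arities
        # all() stops at the first failing pair, like the while loop
        return all(unify(a, f, bindings)
                   for a, f in zip(actual[1:], formal[1:]))
    return actual == formal              # two constants


def resolve(term, bindings):
    """Replace every bound variable in `term` by its instantiation."""
    term = walk(term, bindings)
    if isinstance(term, tuple):
        return (term[0],) + tuple(resolve(t, bindings) for t in term[1:])
    return term


# A call p( X, b( X, Y ) ) matched against a heading p( A, b( c, A ) ).
X, Y, A = Var("X"), Var("Y"), Var("A")
bindings = {}
ok = unify(("p", X, ("b", X, Y)), ("p", A, ("b", "c", A)), bindings)
print(ok, resolve(("b", X, Y), bindings))   # True ('b', 'c', 'c')
```

The dictionary plays the role of the chain of pointers: binding two
variables together is simply an entry that maps one variable to the other.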

Time for a very detailed analysis of a simple example: the procedure

p( A, b( c, A ))

called with the query

p( X, b( X, Y ) ).

Figure 1.2 shows the situation immediately before unification. The
horizontal line separates objects local to the directive and objects local
to the procedure. Note that objects, including variables, are accessible
both directly through their names and as components of other objects.

FIG. 1.2 Unification: before matching.

The first pair of parameters is matched by binding X and A together.
The variables will behave as if they had merged into a single object
(somewhat like two drops of water). This object is accessible under two
different names (Fig. 1.3).

The second pair of parameters is unified in two phases. First, c
becomes the instantiation of the “amalgamated” variables X and A. They
cease to exist as variables, but c is now also accessible through the name
A inside the procedure and through the name X outside (Fig. 1.4).

In the second phase Y is instantiated to the instantiation of A. The 
object c is now accessible as c, A, X and Y (Fig. 1.5). 

The procedure p now terminates, but its local object c remains, being 
accessible from the outside as X or Y. The instantiation of b( X, Y ) is 
b( c, c ) (Fig. 1.6). 

It is convenient to use a special notation for showing the effects of 
unification. We shall write 

A ← B.[]

instead of “A is instantiated to B.[]”; and

A ↔ B

instead of “A and B are bound together.”



FIG. 1.3 Unification: the first pair of parameters is matched.

FIG. 1.4 Unification: the first pair of b’s arguments is matched.

FIG. 1.5 Unification: matching was successful.

FIG. 1.6 Unification: the callee is terminated.

Here are some example calls to our procedure

conscarcdr( Head.Tail, Head, Tail ).

1. conscarcdr( element.[], Car, Cdr )

   Head ← element, Tail ← [],
   Car ← element, Cdr ← [].

2. conscarcdr( L, one.two.[], 3.4.5.[] )

   L ← Head.Tail, Head ← one.two.[],
   Tail ← 3.4.5.[].

   Hence the instantiation of L is

   ( one.two.[] ).3.4.5.[]

3. conscarcdr( A.B, 2, 2.[] )

   A ↔ Head, B ↔ Tail,
   Head ← 2 (and hence also A ← 2),
   Tail ← 2.[] (and hence also B ← 2.[]).

   The instantiation of A.B is now

   2.2.[]

4. conscarcdr( A.B.C, 10, [] )

   A ↔ Head, Tail ← B.C,
   Head ← 10, failure.

   Unification fails. This is not surprising, as the first actual parameter
   describes a list of at least two elements, while the list constructed of
   the second and third actual parameters would have only one element.

We shall wind this up with four general remarks. 

First, we want to stress that the unification algorithm treats actual and
formal parameters absolutely symmetrically. This results in a very
characteristic property of Prolog: there is no difference between formal
parameters used to bring information into a procedure and those used to
carry information out of a procedure. The direction of information flow
changes from call to call, as in examples 1 and 2 above. We can even
make a parameter serve both for input and for output. An example is the
call

conscarcdr( A.2.[], 1, B ).

Here, Head ← A and Tail ← 2.[];
then Head ← 1 (and therefore A ← 1)
and B ← 2.[].

The result is that the first formal parameter was used both for obtaining
information (that 2.[]) and for yielding information (that 1).

This multi-way functioning of procedure parameters sometimes
makes it possible to use procedures in unexpected ways. Whenever we
say that a procedure does such and such, we shall not worry about
what it does after an “unreasonable” call. But you might find thinking
about these things a useful exercise.

The second remark: effects of unification such as merging two variables
are quite consistent with the interpretation of terms as descriptions of
types. We shall illustrate this with a very simple example. The following
two procedures accept only three-field records whose neighbouring fields
are identical:

first2( record( Field12, Field12, Field3 ) ).

last2( record( Field1, Field23, Field23 ) ).

Each of these procedures can be thought of as imposing a constraint on a
description of the record type. The constraints are not mutually
inconsistent, so the directive

:- first2( record( F1,F2,F3 ) ),
   last2( record( F1,F2,F3 ) ),
   write( record( F1,F2,F3 ) ), nl.

writes out the description of a record whose three fields are identical:

record( X1, X1, X1 ).

Similarly, the query

first2( record( F1,F2,field ) ), last2( record( F1,F2,field ) ).

is answered with

F1 = field
F2 = field

Third, the convention that only the most interesting aspects of an 
object are captured in a type description turned out to be quite useful. 
Example 3 would not have worked if the type of the first formal parameter 
had specified that the tail must be a proper list. The term A.B would not 
have been accepted. 

The fourth, and last, remark. If you follow the unification algorithm 
carefully, you will notice that it can create cyclic data structures. For 
example, if the procedure 

same( X, X ).

is invoked with

same( f( V ), V ), write( V ), nl.

then we are in trouble. First, X ← f(V); then V ← X, that is to say
V ← f(V). As a result, f becomes its own component and the printout
will be potentially infinite:

f(f(f(f(f(f(f(f(f(f(f(f( .... 

Such cyclic structures can also cause trouble during unification. If we 
write 

same( f( V ), V ), same( f( W ), W ), same( V, W ),... 

then the unification algorithm will not terminate for the third procedure 
call (this will probably manifest itself as recursion stack overflow): f 
matches f, their first arguments are both f, and their first arguments are 
both f, and so on. 

All this could be avoided if a variable were not unifiable with a term in
which that variable occurs. The unification algorithm is borrowed from
automatic theorem proving (see Chapter 2). The original algorithm contains
this occur check, but most versions of Prolog do not, as it considerably
increases the algorithm’s time complexity. Fortunately, cyclic structures
seldom occur in practice, and one learns to live with the knowledge that
terms are not always DAGs if one blunders badly. One version of Prolog
(Prolog II; see Section 9.2) is built to take advantage of cyclic data
structures. They are called infinite trees and are treated as bona fide
representations of graphs arising in the real world. If one is careful, one
can use such structures even in more conventional Prolog systems: an
example is the calltree program listed in Appendix A.4.
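
To make the occur check concrete, here is a small Python sketch of
unification with the check added. It is an illustration under our own
assumptions only: Var, walk, occurs and unify_oc are hypothetical names,
terms are encoded as tuples (functor first, then the arguments), and
instantiations are kept in a dictionary.

```python
class Var:
    """A variable (hypothetical helper); instances differ by identity."""
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


def walk(term, bindings):
    """Follow bindings until an unbound variable or a non-variable term."""
    while isinstance(term, Var) and term in bindings:
        term = bindings[term]
    return term


def occurs(var, term, bindings):
    """True if `var` appears anywhere inside `term`."""
    term = walk(term, bindings)
    if term is var:
        return True
    if isinstance(term, tuple):
        return any(occurs(var, arg, bindings) for arg in term[1:])
    return False


def unify_oc(x, y, bindings):
    """Unification that refuses to bind a variable to a term containing it."""
    x, y = walk(x, bindings), walk(y, bindings)
    if x is y:
        return True
    for var, term in ((x, y), (y, x)):
        if isinstance(var, Var):
            if occurs(var, term, bindings):   # the occur check
                return False
            bindings[var] = term
            return True
    if isinstance(x, tuple) and isinstance(y, tuple):
        return (x[0] == y[0] and len(x) == len(y)
                and all(unify_oc(s, t, bindings)
                        for s, t in zip(x[1:], y[1:])))
    return x == y


# The call same( f( V ), V ) against the heading same( X, X ).
V, X = Var("V"), Var("X")
print(unify_oc(("same", ("f", V), V), ("same", X, X), {}))   # False
```

The extra traversal performed by occurs on every binding is the price
mentioned above: the cost of a binding now depends on the size of the
term being bound.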


1.2.4. Clauses 

If we want a procedure which computes the fourth element of a list,
we can write

fourth( _._._.E4._, E4 ).

But this method is useless if we want the n-th (or even the hundredth)
element.

After parameters are passed, a procedure can, just as in other
languages, execute a sequence of operations. For example, a procedure
which prints the fourth element of a list would be:

fourth( _._._.E4._ ) :- write( E4 ), nl.

Its body is a sequence of calls, separated by commas and prefixed by a
:- symbol. As you see, a command is like a procedure without a heading.

A procedure heading, possibly followed by a body, is called a clause.
We shall now see how to use clauses for less trivial tasks.


1.3. CONTROL 

1.3.1. The General Form of a Procedure 

What happens when unification fails? 

Part of the answer is that a procedure can consist of a number of
clauses. All these clauses must have headings with the same predicate
symbol, but the parameter specifications may differ. When unification of a
call with the first clause’s heading is successful, the first clause
executes its body (if any). When unification fails, its effects are undone:
all variables which were instantiated by the attempt at unification are
restored to their original, unbound state. The call is then matched against
the heading of the second clause. If this is successful, the second clause
is executed; otherwise the third clause is attempted and so on. To execute
a procedure is thus to execute the first of its clauses whose head matches
the call (but see the next section for a refinement of this statement).
Roughly, the matching clause contains code for that particular combination
of parameter types.

An elementary example is provided by an extended version of the 
procedure carcdr of Section 1.2.1. 

carcdr( Head.Tail, Head, Tail ).

carcdr( [], _, _ ) :- write( 'can''t crack empty list' ), nl.

Here is a somewhat less trivial example, an immortal classic of
introductory Prolog courses. It is a procedure which appends a list at the
end of another list:

append( Hd.Tl, List, Hd.TlAndList ) :-
   append( Tl, List, TlAndList ).

append( [], List, List ).

All terms written in a clause are local to that clause. Both occurrences 
of List in the first clause refer to the same variable, which has nothing to 
do with the variable named List in the second clause. The second clause 
might as well have been 

append( [], Q14, Q14 ).

As in other programming languages with recursion, activation of a
clause is accompanied by creation of new instances of all its local objects.
The terms appearing in the clause describe these instances. Before an
attempt to unify a call with a clause heading can be made, a new instance
of the clause is created. When unification fails, the instance is destroyed.

Armed with this knowledge, we can now watch the effects of calling 
append in the query 

append( a.b.[], c.d.[], Result ).

For clarity, we shall use X', X'', etc., to denote different instances of a
variable named X.

The original call will successfully activate the first clause, after the
following instantiations:

Hd' ← a, Tl' ← b.[], List' ← c.d.[],
Result ← a.TlAndList' (because Hd' is now a).

This clause will execute the call

append( b.[], c.d.[], TlAndList' ),

activating a second instance of the first clause:

Hd'' ← b, Tl'' ← [], List'' ← c.d.[],
TlAndList' ← b.TlAndList''.

In this instance, the body is

append( [], c.d.[], TlAndList'' ).

This call does not match the first clause’s heading, so the second clause is
used:

List''' ← c.d.[], TlAndList'' ← c.d.[].

The third instance of append has no calls to execute, so it returns to the
second instance. The second instance is done with its body, so it returns
to the first instance, which also terminates. The variable Result was
instantiated to a.TlAndList', and TlAndList' to b.TlAndList'', and
TlAndList'' to c.d.[]. Therefore, the query can be answered with

Result = a.b.c.d.[]

It is sometimes useful to represent the state of a computation by the 
sequence of calls which must be executed. The sequence is often called 
the current resolvent (see Section 2.4). If we use our procedure in the 
directive 

:- append( a.b.[], c.d.[], Result ), write( Result ), nl.

then the successive resolvents are as follows:

1. append(a.b.[],c.d.[],Result), write(Result), nl.

2. append(b.[],c.d.[],TlAndList'), write(a.TlAndList'), nl.

3. append([],c.d.[],TlAndList''), write(a.b.TlAndList''), nl.

4. write(a.b.c.d.[]), nl.

5. nl. 

When no calls remain, the directive is terminated. 
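
Readers used to prescriptive languages may find it helpful to compare the
two clauses with a recursive function. The Python sketch below is our own
illustration, not part of any Prolog system: NIL, cons, from_python, show
and the tuple encoding of the list constructor are assumptions made for
the example. It covers only the case traced above, where the first two
parameters are given and the third is computed; the Prolog procedure is
more general, because unification works in every direction.

```python
NIL = "[]"                       # the empty list


def cons(head, tail):
    """Build the term Head.Tail as a tuple whose functor is '.'."""
    return (".", head, tail)


def from_python(items):
    """Turn a Python list into a chain of cons cells ending in NIL."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result


def append(first, second):
    # append( Hd.Tl, List, Hd.TlAndList ) :- append( Tl, List, TlAndList ).
    if isinstance(first, tuple) and first[0] == ".":
        _, hd, tl = first
        return cons(hd, append(tl, second))
    # append( [], List, List ).
    if first == NIL:
        return second
    # No clause matches; in Prolog the call would simply fail.
    raise ValueError("first argument is not a list")


def show(term):
    """Print a flat list in dot notation (nested lists are not bracketed)."""
    if isinstance(term, tuple) and term[0] == ".":
        return f"{show(term[1])}.{show(term[2])}"
    return str(term)


result = append(from_python(["a", "b"]), from_python(["c", "d"]))
print(show(result))              # a.b.c.d.[]
```

Each recursive call corresponds to one new clause instance in the trace:
two activations of the first clause, then one of the second.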

Here is the procedure to find the n-th element of a list. Its first
parameter is n and the second a list. The third parameter returns the n-th
element of the list; if the element does not exist, the constant ? is
returned and an error message is printed. It is assumed that the first
parameter is not negative (we will learn how to check this in the next
section). The procedure is

nth( 0, _, ? ) :- write( 'nth( 0,..,.. )??' ), nl.

nth( N, [], ? ) :- write( 'nth( ...too short,.. )??' ), nl.

nth( 1, El._, El ).

nth( N, _.Tail, El ) :- M is N - 1, nth( M, Tail, El ).

Do trace its execution for a few examples. 
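
As an aid to tracing, the same four cases can be written as a Python
function. This is a rough sketch under our own assumptions: it uses an
ordinary Python list instead of a Prolog term, collects the error messages
in a list called log (a name invented here) instead of writing them, and
follows only the first matching clause, ignoring what would happen on
backtracking.

```python
def nth(n, items, log):
    """Return the n-th element of `items` (counting from 1), or '?'."""
    if n == 0:                        # nth( 0, _, ? )
        log.append("nth( 0,..,.. )??")
        return "?"
    if not items:                     # nth( N, [], ? )
        log.append("nth( ...too short,.. )??")
        return "?"
    if n == 1:                        # nth( 1, El._, El )
        return items[0]
    return nth(n - 1, items[1:], log)  # M is N - 1, nth( M, Tail, El )


messages = []
print(nth(3, ["a", "b", "c"], messages))   # c
print(nth(5, ["a"], messages))             # ?
print(messages)                            # ['nth( ...too short,.. )??']
```

The order of the tests matters here exactly as the order of clauses does
in the Prolog version: the check for an empty list comes before the check
for n equal to 1.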


1.3.2. Backtracking 

But what if a call matches none of the clause headings? An example is 
the call 

conscarcdr( notalist, something, other ) 

This suggests the answer. If none of the clauses fits the call, then 
evidently the call is wrong: its set of actual parameters does not conform 
to any of the type specifications describing parameters acceptable to the 
procedure. 

As in other modern programming languages, such an erroneous call
does not abnormally terminate a program’s execution but activates an
error-handling mechanism. In contrast to other languages, however, the
error is not necessarily handled by an active procedure present on the
activation stack. Prolog uses a more general method and takes into
account even those procedures which returned to their caller after
successful termination. Procedure instances are looked at, one by one, in
reverse order of their activation. The nearest such procedure instance
(call it p) which contains as-yet-unactivated clauses matching its call is
assumed to be able to handle the situation. The computation is undone: its
state is made to appear as if the heading of p’s most recently activated
clause did not match its call, and p is given a chance to execute other
clauses.

This process is called backtracking, and a call which does not match
any clause heading is said to fail. Backtracking closely resembles our
behaviour in systematically searching for a solution to a problem. If we
end up in a blind alley, we get back to the nearest point at which we could
have applied another approach, and apply it. If no approach seems to be
working at that point, we return to the previous point at which we
apparently made a wrong choice, and so on.

In implementation terms, each time a selected clause is not the last in 
its procedure, a record is pushed onto a special stack of fail points (also 
called choice points). The record contains all information necessary to 
restore the state of the computation. When a procedure fails, the topmost 
fail point is popped off the stack, the state described by it is restored and 
the computation proceeds with the next clause. 

It is important to note that not all effects of a computation are
obliterated on backtracking. Some system procedures do things which cannot
be undone, such as writing information on a terminal screen. We say that
these procedures have side-effects.

Using our description of backtracking, try to follow the execution of 
procedure p in the following example: 

p :- q( X ), write( trying( X ) ), nl,
     female( X ), write( ok ), nl.
p :- write( 'Sorry!' ), nl.

q( X ) :- writer( X ). 

q( ? ) :- write( ’No more writers.’ ), nl. 

writer( hesse). 
writer( mann). 
writer( grass). 

female( austen ). 
female( sand ). 

You should get the following printout: 

trying( hesse )
trying( mann )
trying( grass )
No more writers.
trying( ? )
Sorry!
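
If you want to check your trace mechanically, the control flow of this
example can be imitated with Python generators. The sketch is ours and
rests on several assumptions: each resumption of a generator stands for
one backtrack into the corresponding procedure, the printout is collected
in a list called out rather than written, and the names WRITERS and
FEMALES are invented to hold the unit clauses.

```python
WRITERS = ["hesse", "mann", "grass"]      # writer( hesse ). etc.
FEMALES = ["austen", "sand"]              # female( austen ). etc.


def writer():
    """Offer the writers one by one; resuming means backtracking."""
    yield from WRITERS


def q(out):
    # q( X ) :- writer( X ).
    for x in writer():
        yield x
    # q( ? ) :- write( 'No more writers.' ), nl.
    out.append("No more writers.")
    yield "?"


def p(out):
    # p :- q( X ), write( trying( X ) ), nl, female( X ), write( ok ), nl.
    for x in q(out):
        out.append(f"trying( {x} )")      # side-effect: not undone
        if x in FEMALES:
            out.append("ok")
            return True
        # female( X ) failed: the loop resumes q, i.e. backtracks
    # p :- write( 'Sorry!' ), nl.
    out.append("Sorry!")
    return True


printout = []
p(printout)
print("\n".join(printout))
```

Note that the lines already appended to out stay there when the loop goes
round again, just as text written to the terminal survives backtracking.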

You may have noticed from this example that error handling is somewhat
inadequate as a metaphor for backtracking. This is the subject of the
next section.


1.3.3. How to Use Backtracking

When a procedure instance is backtracked to, it behaves as if its most 
recently activated clause did not match its call. We can therefore use 
backtracking to implement extended type checking. 

Recall from Section 1.1.4 that we found it impossible to write terms
which could describe properties such as “the object is a painter’s name”
or “the tail is a properly constructed list.” In other words, while rather
powerful in certain respects, this kind of type specification is weak in
others. This can be remedied by using procedures which do additional
type checking and either fail or successfully terminate, depending on the
outcome. If we want procedure q to accept only properly constructed
representations of paintings, we can write

q( painting( Painter, Name ) ) :-
   ispainter( Painter ), process( painting( Painter, Name ) ).

ispainter( rembrandt ).

ispainter( velasquez ).


Here, the role of ispainter is similar to the declaration of an enumeration 
type in Pascal. 

Prolog has several built-in procedures which can be used to check 
properties of objects. For example, one can check whether the object 
denoted by Something is an integer, by seeing whether the call 

integer( Something )

succeeds or fails.

A number of built-in procedures implement comparison operations.
Like is, procedures for comparing integer values “evaluate” terms
resembling conventional arithmetic expressions. The procedures are <, =<,
=:= (equality), =\= (inequality), >= and >. Their names are predeclared
as infix predicate symbols. For example the call

7*2 + 5 =:= 1+3*6

will be successful. There are also procedures comparing non-integer
constants according to their lexicographic ordering: @<, @=<, @>=, @>.
For example,

alpha @> beta

is a failing call.

Equality of constants can be determined by means of the procedure =
(= is predeclared as infix). The procedure is most easily expressed in
Prolog:

X = X.

It can be used for any two terms, but of course it does more than check
equality. It may cause its parameters to become equal, as it attempts to
unify them. For example,

a( b, X ) = a( Y, c )

will succeed after instantiating

X ← c, Y ← b.

Note that 7 = 7 succeeds, but 5+2 = 2+5 fails, as these are different
terms.

Remember that none of these procedures yields a Boolean result:
they only succeed or fail.
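
The difference between comparing values and comparing terms can be shown
with a small Python sketch. It is an illustration under our own
assumptions: arithmetic terms are encoded as tuples (operator first), the
names OPS, evaluate, arith_equal and same_term are invented, and same_term
covers only terms without variables, for which unification reduces to a
structural comparison.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(term):
    """Evaluate an arithmetic term such as ('+', ('*', 7, 2), 5)."""
    if isinstance(term, tuple):
        functor, left, right = term
        return OPS[functor](evaluate(left), evaluate(right))
    return term


def arith_equal(a, b):
    """Plays the role of =:= : both sides are evaluated first."""
    return evaluate(a) == evaluate(b)


def same_term(a, b):
    """Plays the role of = for variable-free terms: no evaluation."""
    return a == b


# 7*2 + 5 =:= 1+3*6
print(arith_equal(("+", ("*", 7, 2), 5), ("+", 1, ("*", 3, 6))))  # True
# 5+2 = 2+5
print(same_term(("+", 5, 2), ("+", 2, 5)))                        # False
# 5+2 =:= 2+5
print(arith_equal(("+", 5, 2), ("+", 2, 5)))                      # True
```

Unlike these functions, the Prolog procedures return nothing; where the
sketch yields False, the corresponding call fails and backtracking begins.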

When we are interested in the structure of a compound object, we can
use a recursive procedure which does nothing but “accept” the object.
Here is a version of carcdr which works only for true lists and fails for
objects such as a.b.c (but not for objects with variable tails, which match
[]).

carcdr( Head.Tail, Head, Tail ) :- islist( Tail ).

islist( [] ).

islist( _.L ) :- islist( L ).

Such type checking can be quite general. For example, we can process
objects differently according to whether they are or are not members
of a set represented by a list:

process( Obj, Set ) :- member( Obj, Set ),
                       yes_action( Obj ).

process( Obj, _ ) :- no_action( Obj ).

member( El, El.Tail ).

member( El, _.Tail ) :- member( El, Tail ).

Try to trace the execution of process for a couple of simple calls, and 
notice how member is called with successively shorter tails of the list, 
until it finds a tail whose head is unifiable with the first parameter. 

The fact that the first clause of member expresses unifiability rather
than equality has very interesting consequences. The most obvious is that
the procedure can be used to retrieve information from a dictionary
represented by a list. The call

member( phone( krull, Number ),
        phone( mann, 11 ).phone( hesse, 5 ).phone( krull, 11 ).[] )

instantiates Number to 11.

When this information-retrieving effect is coupled with backtracking,
the result is rather striking. Consider the procedure

intersect( L1, L2 ) :- member( E, L1 ), member( E, L2 ).

When given two sets represented by lists, the procedure terminates
successfully if the sets intersect and fails if they are disjoint. Here is
a trace of what happens when we call it with

intersect( a.b.c.d.[], c.d.[] ), write( ok ), nl.

1. intersect(a.b.c.d.[],c.d.[]), write(ok), nl.
   (this activates the procedure:
   L1 ← a.b.c.d.[], L2 ← c.d.[])

2. member(E,a.b.c.d.[]), member(E,c.d.[]), write(ok), nl.
   (activates the first clause of member:
   E ↔ El', El' ← a)

3. member(a,c.d.[]), write(ok), nl.
   (only the second clause matches the call)

4. member(a,d.[]), write(ok), nl.
   (only the second clause matches the call)

5. member(a,[]), write(ok), nl.
   (the call to member fails, nearest “handler” is in the procedure
   activated in step 2, so we backtrack to that situation)

6. member(E,a.b.c.d.[]), member(E,c.d.[]), write(ok), nl.
   (the second clause now:
   E ↔ El', Tail' ← b.c.d.[])

7. member(E,b.c.d.[]), member(E,c.d.[]), write(ok), nl.
   (the first clause:
   E ↔ El'', El'' ← b)

8. member(b,c.d.[]), write(ok), nl.

9. member(b,d.[]), write(ok), nl.

10. member(b,[]), write(ok), nl.
    (failure, backtracking to step 7)

11. member(E,b.c.d.[]), member(E,c.d.[]), write(ok), nl.
    (the second clause now:
    E ↔ El''', Tail''' ← c.d.[])

12. member(E,c.d.[]), member(E,c.d.[]), write(ok), nl.
    (the first clause:
    E ↔ El'''', El'''' ← c)

13. member(c,c.d.[]), write(ok), nl.
    (the first clause)

14. write(ok), nl.

15. nl.

Success.

Notice how the first call to member in intersect is used as a
“backtrack driven” generator of successive elements on the list. A
terminated procedure can be reactivated if the effects of its execution
prove unsatisfactory. It can return several results, or behave in several
ways, and its final effect is determined not only by its actual parameters,
but also by what happens to the computation later on. It is this
multiplicity of possible behaviours that we have in mind when we say that,
in general, a Prolog procedure is nondeterministic. (This does not mean
that its behaviour cannot be predicted to the smallest detail.)
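
The generate-and-test pattern in intersect has a close analogue in
Python generators, sketched below under our own assumptions: the lists
contain only constants (so plain equality stands in for unification), the
optional parameter tried is invented to record which candidates the
generator produced, and member yields its results in list order.

```python
def member(items):
    """Generator: each resumption yields the next element of the list."""
    for element in items:
        yield element


def intersect(l1, l2, tried=None):
    """True if the two lists share an element."""
    for e in member(l1):                    # member( E, L1 ) generates
        if tried is not None:
            tried.append(e)
        if any(e == x for x in member(l2)):  # member( E, L2 ) tests
            return True
        # the test failed: the loop resumes the generator (backtracking)
    return False


candidates = []
print(intersect(["a", "b", "c", "d"], ["c", "d"], candidates))  # True
print(candidates)                                               # ['a', 'b', 'c']
```

The recorded candidates a, b and c correspond to steps 2, 7 and 12 of the
trace, where the first clause of member supplies a new value for E.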

If one wants to see all the results produced by a nondeterministic 
procedure, one can force Prolog to backtrack by calling an undefined 
procedure (the call will fail, because there is no matching clause). It is 
customary to use the name fail, both for readability and because Prolog 
makes it impossible to declare a procedure with this name. To print the 
elements of a list, one can write 

:- member( E, a.b.c.[] ), write( E ), nl, fail.

Alternatively, one can use a query. After answering a query, the
system accepts a single printing character from the terminal. If the
character is a semicolon, it backtracks; otherwise it terminates the query.
When all the possibilities are exhausted, the word no is printed and the
system reads another directive. For example,

user:   female( W ).
system: W = austen
user:   ;
system: W = sand
user:   ;
system: no

If a successful query contains no non-anonymous variables (i.e. no
instantiations to show), the answer is yes.

Our intersect example does more than check for common elements. If
the elements are not ground, the sets are modified. For example,

intersect( one.X.three.[], 1.Y.[] )

succeeds after binding Y to one. This has a natural explanation. Since Y is
unknown, we cannot say that the sets do not intersect, but by binding Y
we ensure that the computation will fail if the supposition that Y is one
turns out to be unacceptable. We shall then assume that X is 1, that X
and Y are the same object, etc., etc.


1.3.4. Static Interpretation of Procedures 

Detailed simulation of a program is not a very attractive way of
learning its meaning. We insisted on doing it to help you understand what
happens inside the computer and to introduce techniques which can
sometimes be useful for debugging, when things are not happening the
way they should. But it is often quite clear what should happen, as many
Prolog procedures can be read without giving a thought to details of
execution.

A clause which has no body is called a unit clause. It is a direct 
definition of a relation between its parameters. The clause 

phone( hermann, 5 ). 

says that hermann and 5 are in the relation phone. Other clauses can 
extend the relation to other objects: 

phone( mann, 11). 
phone( hesse, 5 ). 
phone( krull, 11 ). 

Unary relations can be thought of as expressing properties of objects: 

red( herring). 
red( square ). 

Nullary relations can denote general facts:

tired.
debugging.

A look at the clauses of phone tells that the call

phone( siddhartha, N )

will fail, and the call

phone( Who, 5 )

will nondeterministically produce hermann and (after a failure) hesse.
Note that the calls can be read as “establish whether the actual
parameters are in the relation phone, i.e. succeed if they are in the
relation, or instantiate them so that they will be in the relation and
succeed, or fail.”
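
This relational reading can be imitated in Python by storing the unit
clauses as a table and treating every call as a search through it. The
sketch is ours and makes some assumptions: the table PHONE lists the four
clauses above in order, and passing None for an argument (a convention
invented here) stands for an uninstantiated variable.

```python
PHONE = [("hermann", 5), ("mann", 11), ("hesse", 5), ("krull", 11)]


def phone(who=None, number=None):
    """Yield every stored pair compatible with the given arguments."""
    for name, num in PHONE:
        if who in (None, name) and number in (None, num):
            yield name, num


print(list(phone(who="siddhartha")))   # [] : the call fails
print(list(phone(number=5)))           # [('hermann', 5), ('hesse', 5)]
print(any(phone("krull", 11)))         # True : the pair is in the relation
```

An empty result corresponds to failure, a single result to a successful
check, and several results to the alternatives that backtracking would
offer one after another.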

Somewhat less trivially, the unit clause

conscarcdr( Head.Tail, Head, Tail ).

can be used to establish whether the first parameter is a list formed of the
second and third parameters. It is self-evident that

1. conscarcdr( a, b, c ) fails, because the objects are certainly not in the
   relation;

2. conscarcdr( A.2.B, 1, C.[] ) succeeds, because there does exist a list
   of at least two elements whose second element is 2, such that its head
   is 1 and its tail is a one-element list: the list is 1.2.[] and the tail
   is 2.[];

3. conscarcdr( A, B, B.[] ) succeeds, because there do exist objects A
   and B such that A is a list constructed of B and B.[]: A is B.B.[] and
   B can be any object.

Nonunit clauses are indirect definitions of relations. Thus 

append( [], L, L ).
append( H.T, L, H.TL ) :- append( T, L, TL ).

can be read as

“L is L appended to [],” and

“H.TL is L appended to H.T if TL is L appended to T.”

It is usually convenient to flavour this a little with the intended meaning, 
as in 


“a list L appended to an empty list is L itself,” and 
“a list L appended to a non-empty list H.T is formed of the head of 
that list, H, and the result of appending L to its tail, T.” 

And, most spectacularly, 

intersect( L1, L2 ) :- member( E, L1 ), member( E, L2 ).

reads:

“L1 and L2 intersect if an object E is a member of L1 and a member
of L2”,

in other words 

“two lists intersect if they have a common member.” 

You will find more about this in Chapter 2. But note here that this
interpretation does not fully explain procedures such as process of
Section 1.3.3; this is further discussed in Section 4.3.1.

1.3.5. The Order of Calls and Clauses

In practice, static interpretation is not always sufficient to explain a 
program’s behaviour. It cannot account for the order of calls in a clause 
and the order of clauses in a procedure, because “x and y” means the 
same as “y and x.” Yet this order is important, for three principal 
reasons. 

The first reason is that some procedures, such as write and nl, have
side-effects, i.e. their results are not only variable instantiations. The
order in which several things are written has an obvious effect on the form
of the printout.

Another important reason is efficiency. Here is a famous example
(Kowalski 1974) of a naive sort:

sort( List, Sorted ) :- permute( List, Sorted ),
                        ordered( Sorted ).

The procedure generates successive permutations of a list until it finds 
one that is ordered. If permute and ordered can be used both to check 
their parameters and as generators, then this could also be expressed as 

sort( List, Sorted ) :- ordered( Sorted ),
                        permute( List, Sorted ).

Here, successive ordered lists are generated until a permutation of the 
first parameter is found. Both procedures express the same definition of a 
sorted list, but while the first is only very costly, the second is absolutely 
useless. 
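
To make the cost of the first version tangible, here is a rough Python transcription. It is a sketch, not the book's procedure: itertools.permutations plays the part of permute used as a generator, ordered is used only as a check, and the helper names are our own.

```python
from itertools import permutations


def ordered(xs):
    """Check that each element is not greater than its successor."""
    return all(a <= b for a, b in zip(xs, xs[1:]))


def naive_sort(xs):
    """Generate successive permutations until an ordered one turns up."""
    for candidate in permutations(xs):
        if ordered(candidate):
            return list(candidate)


print(naive_sort([3, 1, 2]))
```

In the worst case every one of the n! permutations is generated before the ordered one is found. The second version has no usable counterpart here: a generator of "successive ordered lists" would have infinitely many lists to propose before reaching a permutation of the input.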

A third reason is that all computations should be finite. We will
illustrate this point with the procedure append, which can be written either as

append( H.T, L, H.TL ) :- append( T, L, TL ).
append( [], L, L ).

or, apparently equivalently, as

append( [], L, L ).
append( H.T, L, H.TL ) :- append( T, L, TL ).

Both versions are equivalent when append is used for appending. But
note that its precise reading from Section 1.3.4 allows for other uses. For
example the second clause, “H.TL is L appended to H.T if TL is L
appended to T,” defines H.TL in terms of H.T and L, but also H.T and L
in terms of H.TL. Indeed, append is often used for splitting a list. If one
executes

append( Front, End, a.b.c.[] ),
write( Front ), write( ' & ' ),
write( End ), nl, fail.

then the first version of append will produce (after successive failures)

a.b.c.[] & []
a.b.[] & c.[]
a.[] & b.c.[]
[] & a.b.c.[]

and the second version 

[] & a.b.c.[]
a.[] & b.c.[]
a.b.[] & c.[]
a.b.c.[] & [].

This difference is not very important. But when we write

append( L1, a.[], L3 )

we expect that append will succeed, after instantiating the terms so that
L3 is a.[] appended to L1. The second version does just this: L1 ← [] and
L3 ← a.[]; then, if we backtrack, L1 ← X1.[] and L3 ← X1.a.[]; then, if
we backtrack again, L1 ← X1.X2.[] and L3 ← X1.X2.a.[]; and so on.
There are infinitely many such solutions.

The first procedure, however, first looks for the last solution in this 
infinite set, and this causes endless recursion. 
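
The effect of clause order can be reproduced with Python generators, where the textual order of the yield statements stands for the order of clauses. This is only a sketch under our own conventions: lists are Python lists, the strings X1, X2, ... stand for fresh variables, and all function names are hypothetical.

```python
from itertools import islice


def split_recursive_first(whole):
    """append( Front, End, whole ) with the recursive clause tried first."""
    if whole:
        for front, end in split_recursive_first(whole[1:]):
            yield [whole[0]] + front, end
    yield [], whole


def split_base_first(whole):
    """The same relation with the clause for [] tried first."""
    yield [], whole
    if whole:
        for front, end in split_base_first(whole[1:]):
            yield [whole[0]] + front, end


def extend_base_first(end, depth=0):
    """append( L1, end, L3 ) with L1 and L3 unknown, [] clause first."""
    yield [], end
    for front, whole in extend_base_first(end, depth + 1):
        var = f"X{depth + 1}"
        yield [var] + front, [var] + whole


def extend_recursive_first(end, depth=0):
    """The same call with the recursive clause first: it never answers."""
    for front, whole in extend_recursive_first(end, depth + 1):
        var = f"X{depth + 1}"
        yield [var] + front, [var] + whole
    yield [], end


print(list(split_recursive_first(list("abc"))))
print(list(split_base_first(list("abc"))))
print(list(islice(extend_base_first(["a"]), 3)))

try:
    next(extend_recursive_first(["a"]))
    outcome = "found a solution"
except RecursionError:
    outcome = "endless recursion"
print(outcome)
```

The two splitting functions enumerate the same four answers in opposite orders. With both lists unknown, the version that tries the [] clause first yields answers one at a time, while the other descends without ever producing one; Python reports this as a RecursionError.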

Nevertheless, with careful programming, considerations of this sort
are needed only to obtain refinements of the general meaning of
procedures given by their static interpretation. Moreover, the order of calls and
clauses is usually a local thing, seldom requiring looking beyond a single
procedure.

1.3.6. The Cut 

We shall now pass on to so-called extralogical features of Prolog.
These are simple and powerful mechanisms which play a large part in
making Prolog a practical programming language, but cannot be
understood in terms of static interpretation, as outlined in Section 1.3.4.

Since we have generators, we must be able to stop them. Suppose
that we have two methods for finding the solution of a problem described
in terms of two sets. Assume one of these methods is significantly cheaper
than the other, but a necessary (though not sufficient!) condition for its
applicability is that the problem-defining sets intersect. We might write
something like

try( Set1, Set2, Solution ) :-
    intersect( Set1, Set2 ),
    method1( Set1, Set2, Solution ).
try( Set1, Set2, Solution ) :-
    method2( Set1, Set2, Solution ).

Now if method1 fails, we want to try method2. But if the sets are large
and have many elements in common, we are effectively stopped by a
generator. Backtracking from method1 will cause intersect to find another
way of showing that the sets do indeed intersect: this changes nothing, so
method1 will be attempted again and again until intersect enumerates all
the elements in the intersection of Set1 and Set2. In terms of processing
time, this might be a disaster. And note that we are lucky: the generator is
not infinite.

To help in such cases, Prolog provides a commit operation, written as 
! and called the cut procedure (old Prolog hands tend to call it the slash, 
after the character /, which was its name in the original Marseilles Prolog).
When procedure p executes a cut, everything that was done by p up to 
that moment—including its choice of current clause—is taken as fixed 
and not to be reconsidered on backtracking. In implementation terms, ! 
cuts away the top section of the fail point stack, leaving only fail points 
created before p was called. 

Our problem can be solved by modifying intersect: 

intersect( L1, L2 ) :- member( E, L1 ),
                       member( E, L2 ), !.

The cut kills the generator of elements from L1.
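
The difference the cut makes can be counted in a small Python model. This is a sketch with assumptions of our own: a generator stands for a procedure that can be re-entered on backtracking, method1 is assumed to fail every time, and the names intersect_cut and count_attempts are hypothetical.

```python
def intersect(l1, l2):
    """Without the cut: one success for every common member found."""
    for e in l1:
        for e2 in l2:
            if e == e2:
                yield e


def intersect_cut(l1, l2):
    """With the cut: commit to the first success, leave no fail point."""
    for e in intersect(l1, l2):
        yield e
        return


def count_attempts(set1, set2, intersect_proc):
    """Count how often method1 would be tried if it failed each time."""
    attempts = 0
    for _ in intersect_proc(set1, set2):
        attempts += 1  # method1 is attempted here and is assumed to fail
    return attempts


print(count_attempts([1, 2, 3], [2, 3, 4], intersect))
print(count_attempts([1, 2, 3], [2, 3, 4], intersect_cut))
```

Without the cut the doomed method1 is retried once per common element; with the cut it is tried once, after which control passes to the second clause of try.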

A more involved example might be useful in clearing up doubts about
the effects of a cut. We will try to move a single cut around in our example
of Section 1.3.2:

p :- q( X ), write( trying( X ) ), nl,
     female( X ), write( ok ), nl.
p :- write( 'Sorry!' ), nl.
q( X ) :- writer( X ).
q( ? ) :- write( 'No more writers.' ), nl.
writer( hesse ).
writer( mann ).
writer( grass ).
female( austen ).
female( sand ).

If we insert a cut into the first clause of writer:

writer( hesse ) :- !.

the printout will be

trying( hesse )
No more writers.
trying( ? )
Sorry!

If we insert it into q instead: 

q( X ):- !, writer( X ). 

we will get 

trying( hesse )
trying( mann )
trying( grass )
Sorry!

But if we choose to insert it at the end of this clause: 

q( X ) :- writer( X ), !.

the program will write

trying( hesse )
Sorry!

By inserting the cut after the call to q in the first clause of p, we would
obtain only

trying( hesse )

As evidenced by these examples, the cut is a powerful tool. A single 
cut can drastically alter the behaviour of a program. It must be used very 
carefully: Section 4.3.1 contains some useful hints. 

An important property of the cut is that it can be used to implement a 
sort of negation. When we want to list all male writers, we can write 

:- writer( X ), male( X ), write( X ), nl, fail.

If, however, the program contains only descriptions of female persons (as
in our example), we must define male in terms of female:

male( X ) :- female( X ), !, fail.
male( _ ).

When X is such that female succeeds, the second clause of male is cut off
and the whole procedure fails. When female fails, the second clause 
takes over and the procedure succeeds. The trick is dirty, but very useful. 
One must be careful, however: if the constant christie is not listed among 
the females, male(christie) will succeed. (More on this in Section 4.3.2.) 


1.3.7. Variable Calls 

The negation schema shown in the previous section is of quite general 
utility. For example, we could write a procedure for checking that two 
sets (represented as lists) do not intersect: 

disjoint( S1, S2 ) :- intersect( S1, S2 ), !, fail.
disjoint( _, _ ).

Prolog provides a very convenient extension which allows us to use 
such schemas without going to the trouble of rewriting them again and 
again. A variable call is a variable occupying the position of a call in a 
clause or directive. When the turn comes to execute the call occupying
this position, the variable’s current instantiation is taken as the call, by
treating its main functor as a predicate symbol and its arguments as
parameters. If we define

do( X ) :- X.

then the call 

carcdr( el.[], A, B ) 

is exactly equivalent to 

do( carcdr( el.[], A, B ) )

as well as to 

do( do( carcdr( el.[], A, B ))). 

We can use this feature to define 

not( X ) :- X, !, fail. 
not( _ ). 

and write 

male( X ) :- not( female( X ) ).
disjoint( S1, S2 ) :- not( intersect( S1, S2 ) ).

The dirty trick is now nicely packaged. 
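
A call passed as a parameter has a close analogue in languages with first-class functions. In the Python sketch below, which is an illustration and not the book's mechanism, a goal is a zero-argument function returning a generator of solutions; not_ asks for at most one solution, which is the role of the cut. The names not_, female, intersect, male and disjoint are chosen to mirror the text.

```python
def not_(goal):
    """Succeed exactly when the goal has no solution."""
    for _ in goal():
        return False  # first clause: the goal succeeded, so "!, fail"
    return True       # second clause: not( _ ).


def female(x):
    if x in ("austen", "sand"):
        yield x


def intersect(s1, s2):
    for e in s1:
        if e in s2:
            yield e


def male(x):
    return not_(lambda: female(x))


def disjoint(s1, s2):
    return not_(lambda: intersect(s1, s2))


print(male("hesse"), male("austen"), male("christie"))
print(disjoint([1, 2], [3, 4]), disjoint([1, 2], [2, 3]))
```

Note that male("christie") holds here for the same reason as in the Prolog program: christie is simply not listed among the females.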
In versions of Prolog described here, not is predefined and the
predicate symbol is predeclared as a prefix symbol. Expanding male in-line, we
would write the directive of Section 1.3.6 as

:- writer( X ), not female( X ), write( X ), nl, fail.

Variable calls can be used to define many useful procedures. We shall
end by showing two companions of “not”: “and” and “or.” The first is
written as a comma and the second as a semicolon; the first succeeds
when both its parameters (taken as calls) succeed, and the second
succeeds when either of its parameters succeeds (but establishes a fail point if
it is the first one). Their definitions are

','( A, B ) :- A, B.
';'( A, _ ) :- A.
';'( _, B ) :- B.

Comma and semicolon are predeclared as infix symbols. After defining
do, we could write the directive above as

:- do( ( writer( X ), not female( X ), write( X ), nl, fail ) ).

The extra parentheses are needed to avoid confusion with a call to do/5. 

Priorities are chosen so that 

artwork( X, Y ) :- painting( X, Y ), oil( Y ) ;
                   etching( X, Y ), brass( Y ).

is equivalent to

artwork( X, Y ) :- ';'( ','( painting( X, Y ), oil( Y ) ),
                        ','( etching( X, Y ), brass( Y ) ) ).

To make the comma and semicolon appear a part of Prolog’s syntax,
Prolog-10 and some of its offspring made the cut behave somewhat
differently for these procedures: they are “transparent” to it. Thus

artwork( X, Y ) :- painting( X, Y ), oil( Y ), ! ;
                   etching( X, Y ), brass( Y ).

avoids checking the second alternative if the first succeeds. 

A similar exception applies to variable calls. If the procedure 

a( X ) :- b, X.
a( _ ) :- c.

is called with

a( ( d, !, fail ) )

then the cut will commit all choices made by d and b and a: the
procedure will fail without executing c.

One should avoid taking advantage of this peculiar property of the 
cut. It is doubtful whether it is necessary. 







2 PROLOG AND LOGIC 


2.1. INTRODUCTION 


“Prolog” stands for “Programmation en logique” (programming in
logic). Static interpretation of procedures (see Section 1.3.4) is possible 
because Prolog can also be viewed as a system for proving theorems 
expressed in logic. Adopting this viewpoint can provide the programmer 
with new insights about the nature of his task. 

In this chapter we attempt to introduce the fundamentals of this
aspect of Prolog in an intuitive manner. Full appreciation of the subject is
possible only for people with a solid background in mathematical logic,
and we assume your knowledge of logic is very elementary. Consequently, the
presentation is often not sufficiently precise, and sometimes the terminology is 
a little unconventional: we are interested in Prolog rather than logic. The 
chapter is a shortcut, so in some places you will find it heavy going. A more 
detailed, but still non-technical treatment can be found in Kowalski (1979b). 
Another relevant book is Robinson (1979). See also van Emden and Kowalski 
(1979). 


2.2. FORMULAE AND THEIR 
INTERPRETATIONS 


Below is a pair of formulae written in the language of predicate logic 
(also known as first-order logic or predicate calculus): 

(2.1) ∀x D( Z, x, x )

(2.2) ∀x ∀y ∀z D( x, y, z ) => D( S( x ), y, S( z ) ).

The basic building blocks of such formulae are predicates. A predicate
consists of a predicate symbol (e.g. D), optionally followed by
arguments: a list of terms in parentheses, separated by commas. A term is a
variable (e.g. x, y, z), or a functor (e.g. Z, S) with an optional list of
arguments, which are terms. Terms denote objects in some universe
(more on this presently) and predicates stand for relations between these
objects.

A single predicate is a formula. A larger formula can be built from
simpler ones by means of logical connectives. The commonly used
connectives, listed in order of decreasing priority, are

—the negation (“not”), written as ¬
—the conjunction (“and”), written as ∧
—the disjunction (“or”), written as ∨
—the implication, written as =>

Parentheses can be used to increase clarity or override priority.

A formula (i.e. also a subformula) can be prefixed by a number of
quantifiers, whose priority is lower than that of the connectives. A
quantifier can be

—the existential quantifier, written as ∃x and read as “there exists an x”.
—the universal quantifier, written as ∀x and read as “for all x”, or “for
any x”.

The formula prefixed by a quantifier is called its scope, and the quantified 
variable is local to this scope (an occurrence of its name outside the scope 
does not denote the same object). In this chapter we shall deal only with 
fully quantified formulae; i.e. our formulae will not contain unquantified 
variables. 

Our example formulae can be read as 

“for any object—call it x—the object Z is in relation D with x and x” 

and 

"for any three (not necessarily distinct) objects—call them x, y and 
z—if x, y and z are in relation D, then so are objects S( x ), y and 
S( z )”. 

In practice, it is more convenient to use a slightly abbreviated reading, in 
which the second formula is 

“for all x, y and z, D( x, y, z ) implies D( S( x ), y, S( z ))”. 

Formulae of this kind are purely formal statements. One cannot
discuss whether they are true or false, because no particular meaning is
attributed to the functors and predicate symbols. To talk about a
formula’s meaning, we must give it an interpretation. An interpretation is a
definition of a universe (the set of objects which can be denoted by terms)
and a decision to let predicate symbols and functors denote particular
relations and functions defined in this universe.

A concrete interpretation maps a (fully quantified) formula to a
statement which is true or false, depending on what it says about relations
between objects. Somewhat imprecisely, we shall say that a formula is
true (or false) in an interpretation. Of course, some formulae are true in all
interpretations (the formula true is a trivial example, and A ∨ ¬A is
another); others are false in all interpretations (e.g. false, A ∧ ¬A). The
first kind of formulae are called tautologies; formulae of the second kind
are called inconsistent.

As an example, consider the following two interpretations of
formulae (2.1) and (2.2). The first interpretation is the following:

—the universe is the set of natural numbers (positive integers);

—Z stands for the number 1 (one);

—S stands for the function S(x) = 2x;

—D(x, y, z) is true if and only if xy = z.

Our formulae now become the true statements

“for any natural number x, 1x = x”

and

“for all natural numbers x, y and z, xy = z implies 2xy = 2z”.

Another interpretation is:

—the universe is the set of non-negative integers; 

—Z stands for the number 0 (zero); 

—S stands for the successor function S(x) = x+1; 

—D(x,y,z) is true if and only if x+y = z. 

The formulae are now 

“for any non-negative integer x, 0+x = x” 

and 

“for all integers x, y and z, x+y = z implies (x+1)+y = z+1”.

If an interpretation maps a formula into a true statement, then this 
interpretation is called a model of this formula. We can also speak about a 
model of a set of formulae—an interpretation in which all of them are true. 

Our two interpretations are models of the example formulae. If Z 
stood for 1 in the second interpretation, then it would not be a model. 
When an interpretation interests us as a model, formulae which are true 
(or false) in that interpretation will be referred to as true (or false) in the 
model. 
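
Whether an interpretation is a model can be probed mechanically on a finite part of the universe. The Python sketch below is an illustration only: checking a finite sample can refute a candidate model but cannot establish one for an infinite universe, and the function name is_model_on and the sample ranges are our own choices.

```python
def is_model_on(sample, Z, S, D):
    """Test formulae (2.1) and (2.2) for all objects in a finite sample."""
    f21 = all(D(Z, x, x) for x in sample)
    f22 = all(D(S(x), y, S(z))
              for x in sample for y in sample for z in sample
              if D(x, y, z))
    return f21 and f22


# First interpretation: Z is 1, S doubles, D(x, y, z) means xy = z.
first = is_model_on(range(1, 8), 1, lambda x: 2 * x, lambda x, y, z: x * y == z)
# Second interpretation: Z is 0, S is the successor, D means x + y = z.
second = is_model_on(range(0, 8), 0, lambda x: x + 1, lambda x, y, z: x + y == z)
# The second interpretation with Z standing for 1 instead of 0.
broken = is_model_on(range(0, 8), 1, lambda x: x + 1, lambda x, y, z: x + y == z)

print(first, second, broken)
```

The third check fails on formula (2.1), since 1 + x = x holds for no x.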
All interpretations are models of tautologies. Inconsistent formulae
have no models.

When we want to talk about a particular model, we prefer to use
symbols which have some mnemonic value. The formula

(2.3) ∀h ∀t conscarcdr( .( h, t ), h, t )

can be interpreted as

“for all integers h and t, the difference between h+t and h is t”,

but this is better written as

∀h ∀t difference( +( h, t ), h, t ).

The similarity is interesting, though: looking for other models of our 
statement of a problem often provides illuminating insights into its nature. 

Notice that the “natural” interpretation of formula (2.3) is very 
down-to-earth. A list constructor can be thought of as a function mapping 
two objects (a head and a tail) into a third object: the universe can be a set 
of data structures. 


2.3. FORMAL REASONING 


The notion of logical consequence allows us to perform formal
reasoning, i.e. reasoning which takes into account only the syntactic form of
formulae and disregards their interpretations. We say that formula α is a
logical consequence of a set of formulae β, β', β''... if all models of the set
β, β', β''... are also models of α. It is a fundamental fact of logic that there
exist inference rules, which are correct recipes for deriving logical
consequences (conclusions) of other formulae (premises), provided the latter
have a certain form. The inference rules are usually quite simple, but we
can use them as elementary steps in long derivations. This is the
backbone of mathematics: a set of formulae (axioms) defines a theory, which is
the set of all formulae (called theorems) true in all models of the axioms; a
formal derivation of a new theorem is called its proof. (The name axioms
is often reserved for a minimal set of theorems specifying the theory of
interest. We find it more convenient to use the name for any “given” set
of theorems accepted without proofs.)

Some inference rules are relatively trivial applications of the
definitions of logical connectives. A well-known example is the modus ponens:

“from any formula α and from any formula of the form α => β, derive
the formula β”.
Now the definition of implication can be stated as follows: if α and β are
arbitrary formulae, then, in any interpretation, α => β is false if and only if
α is true and β is false in that interpretation. Hence, any model of both α
and α => β must also be a model of β.

Two other simple rules are

“α => β is equivalent to ¬α ∨ β
(i.e. one can be derived from the other)”

and one of the De Morgan laws

“¬( α ∧ β ) is equivalent to ¬α ∨ ¬β”.

Do convince yourself of their validity—we will need them presently! 
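
One way to convince yourself is to run through all truth assignments. The Python sketch below does this for nullary α and β; the helper names implies and equivalent are our own, and implies encodes the definition of implication given above.

```python
from itertools import product


def implies(a, b):
    """a => b is false only when a is true and b is false."""
    return not (a and not b)


def equivalent(f, g, arity):
    """Two formulae are equivalent if they agree in every interpretation."""
    return all(f(*v) == g(*v) for v in product([False, True], repeat=arity))


rule_1 = equivalent(implies, lambda a, b: (not a) or b, 2)
de_morgan = equivalent(lambda a, b: not (a and b),
                       lambda a, b: (not a) or (not b), 2)
# Modus ponens: b holds in every interpretation where a and a => b hold.
modus_ponens = all(b for a, b in product([False, True], repeat=2)
                   if a and implies(a, b))

print(rule_1, de_morgan, modus_ponens)
```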

Armed with a number of inference rules, we can attempt to derive a
formula directly or by means of a technique known as reductio ad
absurdum. To derive formula α from a set of axioms, assume that ¬α is a
theorem: if the resulting theory is inconsistent, then α is a theorem. A
theory is inconsistent if it contains an inconsistent formula. In this method
of proof we often show inconsistency by finding a formula β such that we
can derive

β ∧ ¬β.

It is worth noting that all formulae are theorems of an inconsistent
theory. This is because, there being no models of the theory, no formula is
false in any of the models. (This might not have sounded too convincing,
but notice that if we can derive false, then we can derive any formula α
using modus ponens and false => α. For any α, the formula false => α is a
tautology, because it is equivalent to ¬false ∨ α, that is to say true ∨ α.)
Consequently, if the set of formulae

¬α
β
β'
...

is inconsistent, then α is certainly a theorem, regardless of whether the set
β, β', ... is consistent or not.


2.4. RESOLUTION AND HORN CLAUSES 


We shall be interested in an inference rule which we shall call the rule
of resolution (Robinson 1965). It says

“from ¬α ∨ β and from α ∨ γ derive β ∨ γ”.

Its validity is not hard to explain. In any model of ¬α ∨ β and α ∨ γ,
either α is true or α is false. In the first case β must be true (or else
¬α ∨ β would not be true), in the second γ must be true. If a model of
¬α ∨ β and α ∨ γ must also be a model of β or a model of γ, then, by
definition of disjunction, it is a model of β ∨ γ.

There are two interesting special cases of this rule. One is

“from ¬α ∨ β and from α derive β”,

and the other is

“from ¬α and from α derive □”.

Here □ stands for the empty formula, which must be treated as
equivalent to false if this form of the rule is to be valid.

The rule of resolution is useful for reductio ad absurdum proofs when 
our formulae are written in a restricted form called clausal form. A clause 
is a disjunction of literals. A literal is either a predicate (called positive 
literal) or a negated predicate (called negative literal). All clauses are 
prefixed by universal quantifiers, one for each variable in the clause. 

We shall limit our attention to Horn clauses, which have at most one 
positive literal each. Here is a set of four Horn clauses (the predicates are 
all nullary): 

A ∨ ¬B ∨ ¬C
B ∨ ¬D
C
D

Now, if we want to prove that A can be derived from these clauses, we 
can use the rule of resolution to show that by adding the Horn clause 

¬A

to our set of formulae, we obtain an inconsistent set of clauses. The proof 
can be carried out in the following four steps (we use parentheses to make 
things more clear): 

1. from ¬A and from A ∨ (¬B ∨ ¬C) derive ¬B ∨ ¬C
2. from ¬B ∨ ¬C and from B ∨ ¬D derive ¬C ∨ ¬D
3. from ¬C ∨ ¬D and from C derive ¬D
4. from ¬D and from D derive □
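
The four steps can be replayed mechanically. In the Python sketch below, which rests on a representation of our own choosing, a clause is a frozenset of literals, a negative literal is written with a leading ~, and the empty frozenset plays the part of □. The function names negate and resolve are hypothetical.

```python
def negate(lit):
    """Turn A into ~A and ~A into A."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolve(c1, c2, atom):
    """From (~atom v rest1) and (atom v rest2) derive rest1 v rest2."""
    if negate(atom) not in c1 or atom not in c2:
        raise ValueError("the clauses do not clash on " + atom)
    return (c1 - {negate(atom)}) | (c2 - {atom})


axioms = {
    "A": frozenset({"A", "~B", "~C"}),
    "B": frozenset({"B", "~D"}),
    "C": frozenset({"C"}),
    "D": frozenset({"D"}),
}

resolvent = frozenset({"~A"})  # the negated theorem
trace = []
for atom in ["A", "B", "C", "D"]:
    resolvent = resolve(resolvent, axioms[atom], atom)
    trace.append(sorted(resolvent))

print(trace)
```

The successive resolvents are ~B v ~C, then ~C v ~D, then ~D, and finally the empty clause, in agreement with steps 1 to 4.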

Notice that this type of reductio ad absurdum proof is successful 
when we derive the empty clause □ (i.e. false). The special cases of the
resolution rule are used to shorten formulae, while the general rule is used
to generate formulae which can be shortened. Now, if in an application of 
the resolution rule both the premises have one positive literal each, then 
the conclusion must also have one positive literal (do you see why?). 
Hence, the proof cannot be successful unless at least one of the clauses 
has no positive literals. However, if one of the premises has only negative 
literals, then so has the conclusion. If only one of the initial clauses has this 
form, then the proof can be made particularly simple (Kowalski and Kuehner 
1971; Hill 1974). One of the premises in the first step is the clause without 
positive literals. If this step cannot derive the empty clause, then the second 
step must use the only other clause without positive literals, i.e. the conclusion 
of the preceding step, and so on. If—as in the example—all our axioms have 
positive literals, then the negated theorem must have none and the final proof 
has the form of an orderly chain, in which each step provides a premise for the 
immediately succeeding one. In each step, we shall call the premise without 
positive literals the current resolvent. 

Each step consists in cancelling a negative literal ¬X in the current
resolvent, by replacing it with the negative literals of a clause having X as
its only positive literal. The resolvent shrinks when one of its literals is
cancelled with a unit clause, which has only a single positive literal.

Because a clause is a disjunction of literals, it can be written as an
implication. By the De Morgan law (see Section 2.3) A ∨ (¬B ∨ ¬C)
is equivalent to A ∨ ¬(B ∧ C). This, in turn, is equivalent to

B ∧ C => A.

By analogy, we can write

B ∧ C =>

to denote ¬B ∨ ¬C (i.e. a Horn clause with no positive literals). The
empty consequent represents false, since false (or its equivalent) is the
only formula α such that α ∨ (¬B ∨ ¬C) is equivalent to ¬B ∨ ¬C,
for all B and C.

Similarly, we shall denote A, a Horn clause with no negative literals, 
by 

=> A 

Here, the empty premise represents true: A is equivalent to true => A. An 
empty clause has no literals and can be written 

=> 

i.e. true => false, which is equivalent to false. 

It is preferable to write the implication from right to left:

A <= B ∧ C

to suggest the reading

“to prove A, prove B and C”.

Recalling that our resolvents have no positive literals, we can now
write the resolution rule as

“from <= α ∧ β and from α <= γ derive <= γ ∧ β”

where β, γ or both may be empty (in that case we do not write the ∧). This
is a rather mechanical prescription: to get rid of α, find an implication
whose consequent is α and replace α with its premises. This is clearly
justified: since α can be proven by proving γ, then α ∧ β can be proven by
proving γ ∧ β.
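
Read in this way, the rule is a recipe for reducing a list of goals. The Python sketch below applies it to nullary predicates only; the dictionary layout (each consequent mapped to a list of alternative premise lists) and the name prove are assumptions made for the illustration, and no attempt is made to detect looping programs.

```python
def prove(goals, program):
    """Try to reduce the resolvent '<= goals' to the empty resolvent."""
    if not goals:
        return True  # empty resolvent reached
    first, rest = goals[0], goals[1:]
    # Replace the leftmost goal by the premises of a matching implication.
    for premises in program.get(first, []):
        if prove(premises + rest, program):
            return True
    return False  # no clause leads to the empty resolvent


# A <= B and C,  B <= D,  C <=,  D <=
program = {"A": [["B", "C"]], "B": [["D"]], "C": [[]], "D": [[]]}

print(prove(["A"], program))
print(prove(["A"], {"A": [["B", "C"]], "B": [["D"]], "C": [[]]}))
```

For the goal A the successive resolvents are B ∧ C, then D ∧ C, then C, then the empty one. Without the unit clause for D the second call gets stuck and reports failure.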

A clause being always prefixed with universal quantifiers for each of 
its variables, it is convenient not to write the quantifiers. Our formulae 
from Section 2.2 are two Horn clauses, written as 

D( Z, x, x ) <=

D( S( x ), y, S( z ) ) <= D( x, y, z )

Let us see whether these clauses are consistent with

<= D( S( Z ), x, S( S( Z ) ) )

The clause can be thought of as a query whether there exists an x which is
in relation D with S(Z) and S(S(Z)). It is equivalent to

∀x ¬D( S( Z ), x, S( S( Z ) ) ),

and since this is the negation of what we are trying to prove, our derived
formula (if we succeed) will be

∃x D( S( Z ), x, S( S( Z ) ) )

(if α is not false for all x, then there must be at least one x for which α is
true).

The rule of resolution, in the form presented above, is useless for this 
example. In fact, it could only be used for nullary predicates, because the 
argument for its validity does not apply to premises such as 

∀x ¬A( x ) ∨ B( x ) and ∀y A( Z ) ∨ C( y ).

Fortunately, we can also employ a simple inference rule called the
substitution rule. It says

“from ∀x α( x ) derive α( τ ), where τ is an arbitrary term”.

Here, α(x) means that the formula α contains occurrences of variable x.
α(τ) stands for a formula which looks exactly like α, except that all
occurrences of x have been replaced by occurrences of term τ. An example of
the rule's application is

“from ∀x D( S( Z ), x, S( x ) ) derive
D( S( Z ), S( S( Z ) ), S( S( S( Z ) ) ) )”

The substitution rule is valid, of course. In every model of ∀x α(x), α is
true for any object x (that is what the quantifier says!): hence, it is true for
any particular object.

It should now be clear that

“from ∀x α( x ) and ∀y β( y ) derive ∀z α( z ) ∧ β( z )”

is also a valid inference rule. It can be looked on as an application of the
substitution rule: in all models of ∀x α(x) and ∀y β(y), α(τ) and β(τ) are
true for any τ, hence (by the definition of conjunction) α(τ) ∧ β(τ) is true
for any τ, and therefore ∀z α(z) ∧ β(z) is true.

The substitution rule allows us to match different formulae by using
appropriate variable substitutions. For example, we can easily show that
¬D(x, y, z) is inconsistent with D(Z, Z, Z), because we can derive
¬D(Z, Z, Z) from the first formula by substituting Z for x, y and z. We
shall denote such substitutions by

x ← Z, y ← Z, z ← Z.

Our ability to match formulae allows us to apply to the problem at 
hand implications which express general rules. This is best illustrated 
with our running example. (We use apostrophes to distinguish between 
variables similarly named, but quantified in different scopes or used in 
different applications of a formula.) 

1. Match <= D(S(Z), x, S(S(Z)))
   and D(S(x'), y', S(z')) <= D(x', y', z')
   by substituting x' ← Z, y' ← x, z' ← S(Z).

2. From <= D(S(Z), x, S(S(Z)))
   and D(S(Z), x, S(S(Z))) <= D(Z, x, S(Z))
   derive <= D(Z, x, S(Z)) by the rule of resolution.

3. Match <= D(Z, x, S(Z)) and D(Z, x'', x'') <=
   by substituting x'' ← x, x ← S(Z).

4. From <= D(Z, S(Z), S(Z)) and D(Z, S(Z), S(Z)) <=
   derive □ (the empty resolvent) by the rule of resolution.
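
The substitutions of steps 1 and 3 can be checked by carrying them out. The Python sketch below uses a representation of our own: a variable is a plain string, any other term is a tuple whose first component is the functor, and subst is a hypothetical helper that applies a set of substitutions repeatedly.

```python
def subst(term, bindings):
    """Replace variables (plain strings) by the terms bound to them."""
    if isinstance(term, str):
        return subst(bindings[term], bindings) if term in bindings else term
    functor, *args = term
    return (functor, *(subst(a, bindings) for a in args))


Z = ("Z",)


def S(t):
    return ("S", t)


def D(a, b, c):
    return ("D", a, b, c)


goal = D(S(Z), "x", S(S(Z)))                 # <= D(S(Z), x, S(S(Z)))
head, body = D(S("x'"), "y'", S("z'")), D("x'", "y'", "z'")
fact = D(Z, "x''", "x''")                    # D(Z, x'', x'') <=

step1 = {"x'": Z, "y'": "x", "z'": S(Z)}
step1_matches = subst(head, step1) == goal   # step 1: the literals match
new_goal = subst(body, step1)                # step 2: <= D(Z, x, S(Z))

step3 = {"x''": "x", "x": S(Z)}
step3_matches = subst(new_goal, step3) == subst(fact, step3)  # steps 3 and 4

print(step1_matches, new_goal, step3_matches)
```

After step 3 both literals have become D(Z, S(Z), S(Z)), so the last resolution step leaves nothing to prove.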

A noteworthy feature of this example is that the term S(Z), finally 
substituted for the x in 


<= D( S( Z ), x, S( S( Z ) ) )

can be thought of as a counterexample to the disproved hypothesis that
this formula is consistent with the others. We learned in effect that, in any
model of the two clauses playing the role of axioms,

“it is not true that ∀x ¬D( S( Z ), x, S( S( Z ) ) ), because
¬D( S( Z ), x, S( S( Z ) ) ) is false when x is S( Z )”.

But this is the same as saying that our question

“does there exist an x such that D( S( Z ), x, S( S( Z ) ) )”

is answered with 

“yes, S( Z ) is such an x”. 

Recall our two interpretations from Section 2.2. In the first we asked
whether there is an x such that 2x = 4, and the answer is that 2 is such an
x. In the second interpretation, the question whether there is an x such
that 1 + x = 2 was answered by 1.

The various substitutions were used to narrow the set of interesting
objects to those objects for which the formula being disproved is not true.
Indeed, it is evident that for all x other than S(Z), the formula
¬D(S(Z), x, S(S(Z))) is true in both interpretations. It is so in all models of
the two original formulae, but we shall not attempt to justify it directly.
Our example would be an “indirect” justification if we could be certain
that the substitutions did not “lose” other objects satisfying the
disproved formula. Such certainty would be of practical value, because the
answer to our query (i.e. our counterexample) can be a term containing
variables. We want it to be as general as possible, in the sense that the set
of all terms obtainable by substituting something for its variables should
be the set of all answers.

It is a fundamental fact of resolution theory that the algorithm of
unification (as presented in Section 1.2.3, but extended with the occur
check) finds the most general set of substitutions needed to match two
literals. “Most general” means that it is contained by all sets of
substitutions which make the literals match. When we match A(x) and ¬A(y),
both x ← y and y ← x are possible; we treat them as indistinguishable.
In a sense, this is the minimal necessary set of substitutions.
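
For readers who like to experiment, here is a compact Python sketch of unification with the occur check. It is not the algorithm of Section 1.2.3 verbatim but a common textbook formulation under our own conventions: a variable is a plain string, any other term is a tuple headed by its functor, and the result is a dictionary of substitutions or None on failure.

```python
def walk(t, s):
    """Follow substitutions until t is unbound or a non-variable."""
    while isinstance(t, str) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    """Occur check: does variable v appear inside term t?"""
    t = walk(t, s)
    if t == v:
        return True
    return not isinstance(t, str) and any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s=None):
    """Return substitutions making a and b identical, or None."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str):
        return None if occurs(b, a, s) else {**s, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None  # different functors or numbers of arguments
    for x, y in zip(a[1:], b[1:]):
        s = unify(x, y, s)
        if s is None:
            return None
    return s


Z = ("Z",)
print(unify(("A", "x"), ("A", "y")))
print(unify(("D", Z, "x", "x"), ("D", Z, ("S", Z), ("S", Z))))
print(unify("x", ("S", "x")))
```

The first call binds x to y (binding y to x would serve equally well), the second finds x ← S(Z), and the third fails because of the occur check.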

We used unification in our example, so we did not lose any solutions. 
Notice, however, that our discussion is concerned with the effects of the 
substitution rule; as we shall see, different proofs can come up with 
different solutions. Try the conscarcdr and append examples from Chap¬ 
ter 1 to get a feeling for this kind of proof. You may use infix notation for 
functors—it does not matter. The intersection example might be a little 
more difficult; read on. 


2.5. STRATEGY 

Disjunction being commutative, we can apply the rule of resolution to 
any literal in the current resolvent: in our examples, we always chose the 
leftmost one. The choice does not affect our ability to finish the proof, as 
we must be able to cancel all the literals before obtaining the empty 
resolvent. As it turns out, the desirable properties of unification men¬ 
tioned in the previous section ensure that the order in which the cancel¬ 
ling of literals is performed does not influence the final outcome of the 
proof. The length of a proof, however, can be affected by the choice of 
literals very strongly indeed. We shall discuss this matter at the end of this 
section. 

In our examples, at most one clause could be used to cancel a literal in 
each step. In general, a number of clauses can be applicable (after suitable 
matching) to a given literal. Choosing the right clause could be important, 
because some of them can lead into “blind alleys”. After many steps, we 
may end up with a resolvent to which the rule of resolution cannot be 
applied (because there are no matching clauses for its literals), even 
though another choice of clause at an earlier step might have speedily led 
to the empty resolvent. 

The situation is illustrated by Fig. 2.1b, which shows part of the 
search space for the problem listed in Fig. 2.1a. The space is tree-shaped: 
each path from the root to a leaf represents a possible derivation se¬ 
quence; its nodes are labelled with the successive resolvents. Some of the 
paths are successful, some end in failure. 

Notice that several subtrees occur more than once. This effect would 
be less pronounced if the resolvents reflected the history of substitutions, 
but we did not feel up to creating such a drawing for predicates with 
arguments. Try it for the application of intersect traced in Section 1.3.3. 
(Figure 2.5 will show you how to do it.) 

Prolog always tries to use the leftmost literal, so its search space is 
considerably smaller, as illustrated in Fig. 2.1c. Whenever it is presented 
with a number of applicable clauses, the system always attempts the first 
one first. When it encounters failure, it backtracks and tries another path. 
In effect, it executes an orderly preorder search of the search space tree. 
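
The strategy just described can be mimicked by a three-clause interpreter. This is only a sketch: the propositional clauses follow the fragments of Fig. 2.1 that we can rely on (B has two clauses, the first of which leads into a blind alley), while the clause for a, the fact f and the name solve are assumptions made for the sake of a runnable example. The dynamic declaration is there so that clause/2 may inspect the program and so that e, which has no clauses, simply fails.

```prolog
:- dynamic a/0, b/0, c/0, d/0, e/0, f/0.

a :- b, c.      % assumed: A is proved from B and C
b :- d, e.      % first clause of B: a blind alley, e cannot be proved
b :- f.         % second clause of B
c :- f, d.
d.
f.              % assumed fact

% solve(Goal): always take the leftmost literal; try the clauses in
% textual order; on failure, backtrack to the nearest choice point.
solve(true) :- !.
solve((Left, Right)) :- !, solve(Left), solve(Right).
solve(Goal) :- clause(Goal, Body), solve(Body).
```

The call solve( a ) first tries b through d and e, fails on e, backtracks to the second clause of b, and then succeeds.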

Figure 2.1c also illustrates the effect of a cut: a part of the search 
space is shorn off, but one must be aware that this part may contain 
solutions! While this is not important in the example (if we expect a yes/ 
no answer), in general different solutions may represent essentially differ¬ 
ent instantiations (i.e. substitutions) of variables in the root of the tree. 

This is not in contradiction to our earlier statements about “the desir¬ 
able properties of unification”. As evidenced by Fig. 2.2, by choosing 
different literals we change only the order in which things are proved, but 
the general structure of the proof is not changed. It is convenient to 
represent the structure of a proof by means of a proof tree (do not confuse 
it with the search space), as illustrated in Fig. 2.3. The first tree shows the 
proofs of the preceding figure, which all used B and C to prove A, and F 
to prove B. The other proof trees represent classes of proofs obtained 
through a different choice of clauses: the proofs are carried out quite 
differently. 

Structurally different proofs use different subsets of the available 
clauses for performing various subproofs, so the “counterexamples” of 
Section 2.4—which are descriptions of sets of objects for which the 
clauses used cannot all be true—may turn out to be different. Therefore, not 
all solutions are the same. 

Proof trees are interesting also because they reflect the invocation 
tree when clauses are treated as procedures. As long as there are no 
failures, the conventional procedure activation stack can be regarded as 
an equally conventional stack used for preorder traversal of the proof 
tree, or (if failure is imminent) of a quasi-proof tree which has a failure 
in its rightmost leaf (see Fig. 2.4). Backtracking to the nearest choice 
point in the search space may reopen an attempt to prove a node in a 
finished branch of the quasi-proof tree, i.e. to reactivate a terminated 
procedure (this would happen if D had a second clause in our example). 
Obviously, this cannot be done with a single simple stack: the matter is 
discussed in Chapter 6. 

    B <= D ∧ E 
    B <= F 
    C <= F ∧ D 
    D <= 

FIG. 2.1 (a) A search space: the source formulae (the axioms and the 
negated theorem). (b) A search space: an initial part of the complete space. 
(Choice of literals denoted by an arc, choice of clauses by a dot.) (c) A 
search space: Prolog search space has no choices for literals. (Solution one 
is found, the others could be found on backtracking. ! marks the subspaces 
made unreachable by executing a cut at the end of A's first clause.) 

The word strategy refers to the way in which a theorem-prover 
(whether automatic or human) finds its way through a search space. In 
logic programming literature the preferred term is control. It is not all a 
question of choosing the order of clauses and literals. Some logic pro¬ 
gramming systems employ no backtracking, choosing to cover the search 
space in a breadth-first manner. This helps avoid problems caused by 
misapplied generators and left recursion (see Fig. 2.5 and Section 3.5.2). 
There is, of course, the problem of potentially exponential memory requirements. 
Prolog’s appetite for memory is at most roughly linear in 
the number of attempted derivation steps. This is paid for with the cost of 
backtracking: the time complexity is still exponential. 
There have been attempts to decrease this cost by means of a more 
sophisticated backtracking strategy (Bruynooghe 1978, Pereira and Porto 
1980b, 1982, Bruynooghe and Pereira 1981). 

FIG. 2.2 Choice of literals with fixed choice of clauses from Fig. 2.1: the 
clauses used, and the tree of possible proof paths in these proofs. (Prolog would 
choose the leftmost path, but only after some backtracking caused by choosing the first 
clause of B.) 

FIG. 2.3 Proof trees for the example of Fig. 2.1. (The leftmost tree represents the 
proofs of Fig. 2.2. Backtracking would cause Prolog to build each tree in turn, from left to 
right.) 

FIG. 2.4 A quasi-proof tree, representing a failing attempt to prove B by using its first 
clause. (The right son of A was not generated.) 

The more ambitious scheme, appropriately called “intelligent back¬ 
tracking”, attempts to retain subproofs (which would otherwise have 
been discarded) in order to avoid recomputing them again and again. In 
other words, it attempts to take advantage of the multiplicity of identical 
subtrees in a search space (compare Fig. 2.1b). 

A simpler approach, called “selective backtracking”, consists in ana¬ 
lysing which variable instantiations caused the failure. It is then possible 
to backtrack directly to the nearest point where one of these instantiations 
was made or where the computation would take an entirely different 
course. In some cases this can save us a lot of thrashing about in a failure- 
infested region of the tree’s crown. 

Unfortunately, these interesting ideas have not influenced Prolog im¬ 
plementations. They require a further complication of the already com¬ 
plex runtime data structures and they do not mesh well with side-effects 
of system procedures. The fact that Prolog can be used as a practical 
language is still largely due to our dexterity in fighting exponential com¬ 
plexity with the cut. 

Attempts to modify Prolog’s strategy so that it would incorporate 
parallelism or coroutining have been a little more successful. Parallelism 
consists in growing various branches of a proof tree (or even several 
trees) simultaneously. It is difficult, not only because it raises tricky tech¬ 
nical problems, but because we still lack sufficient understanding of its 
effects on both time /space complexity and the number of solutions gained 
or lost. Several very different approaches have been documented, but 
none of them seems to answer all pertinent questions. Some of the references 
are (Clark and Gregory 1983, Shapiro 1983b, Conery and Kibler 
1983, Wise 1984, Eisinger et al. 1982). 

Coroutining is just that: switching control between several active pro¬ 
cedures. In terms of our drawings, coroutining is a non-trivial traversal of 
a proof tree. Roughly, it is a matter of choosing a different order of literals 
(a different path in Fig. 2.2). It is possible to demonstrate spectacular 
improvements in the performance of some programs when they are executed 
in coroutining fashion. A simple example is the naive sort (see 
Section 1.3.5): 

    sort( List, Sorted ) :- permute( List, Sorted ), 
                            ordered( Sorted ). 

When the execution of permute is interleaved with that of ordered so that 
the latter can cause failure as soon as the first out-of-order element is 
produced by the former, the program’s behaviour compares very favourably 
with that shown when each permutation must be completed 
before ordered is called. When the initial sequence of a permutation is 
rejected, all other permutations starting with the same sequence can at 
once be rejected as well, resulting in a significant reduction of the search 
space. 
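
The effect can be tried out in a Prolog that offers delayed goals. The sketch below assumes SWI-Prolog, whose library predicate when/2 suspends a goal until a condition on its variables holds; it is not the mechanism of any of the systems cited in this section. The names naive_sort, co_sort and ordered_when_known are ours, and select/3 and same_length/2 come from the lists library of that system.

```prolog
:- use_module(library(lists)).
:- use_module(library(when)).

% Generate and test: each permutation is completed before it is checked.
naive_sort(List, Sorted) :-
    permute(List, Sorted),
    ordered(Sorted).

% Coroutining: the order tests are set up first, each one suspended
% until both of its arguments are known; permute then wakes them up
% one by one as it fills in the elements from left to right.
co_sort(List, Sorted) :-
    same_length(List, Sorted),
    ordered_when_known(Sorted),
    permute(List, Sorted).

permute([], []).
permute(List, [X|Rest]) :-
    select(X, List, Others),
    permute(Others, Rest).

ordered([]).
ordered([_]).
ordered([X, Y|Rest]) :- X =< Y, ordered([Y|Rest]).

ordered_when_known([]).
ordered_when_known([_]).
ordered_when_known([X, Y|Rest]) :-
    when((nonvar(X), nonvar(Y)), X =< Y),
    ordered_when_known([Y|Rest]).
```

Both naive_sort( [3,1,2], S ) and co_sort( [3,1,2], S ) instantiate S to [1,2,3], but in co_sort a permutation beginning with 3, 1 is abandoned as soon as its second element is chosen.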

Section 9.2 contains two short examples of coroutining Prolog programs. 
Coroutining comes in many flavours: some of the references are 
(Clark et al. 1979, Clark et al. 1982, Porto 1982, Colmerauer et al. 1983). 
Unfortunately, most of these schemes are of restricted utility (Kluzniak 
1981). The problem is that coroutines do not mesh well with backtracking. 
We comment on this at greater length in (Kluzniak and Szpakowicz 1984). 





3 METAMORPHOSIS GRAMMARS 
A POWERFUL EXTENSION 


3.1. PROLOG REPRESENTATION 
OF THE PARSING PROBLEM 

We shall begin with a very simple formulation of the parsing problem: 
given a sequence of items, find out whether it has some presupposed 
structure. The problem appears e.g. in programming languages when we 
want to make sure that some text is a syntactically valid statement. Ad¬ 
missible structures are usually described by a context-free grammar. As 
an example we shall consider the following small grammar in Backus- 
Naur-Form, which describes simple list expressions: 

       < list > ::= ( ) | ( < items > ) 

       < items > ::= < item > | < item > , < items > 

(3.1)  < item > ::= < atom > | < list > 

       < atom > ::= < letter > | < letter > < atom > 

       < letter > ::= a | b | c | d | e | f | g | h | i | j | k | l | m | 
                      n | o | p | q | r | s | t | u | v | w | x | y | z 

Terminal symbols of this grammar are small letters, round brackets, 
and a comma. For example, the list 

(a, (big,ox)) 

consists of 12 terminal symbols. 

There are several commonly used methods of describing the structure 
of a list (or, more generally, of a valid sequence of terminal symbols). The 
method we adopt here leads to an elegant formulation of the parsing 
problem in Prolog¹. 

We shall depict a sequence of terminal symbols in a graph (3.2): 


Every node in this graph corresponds to a boundary between two consecutive 
terminal symbols; every edge connecting two nodes corresponds to 
the terminal symbol it is labelled with. Two edges are contiguous if they 
share a node; a sequence e_1, ..., e_m of edges is contiguous if e_i and e_(i+1) are 
contiguous for i = 1, 2, ..., m - 1. For example, the edges labelled b, i, g 
are contiguous. The labels of contiguous edges are also contiguous. 

A sequence of contiguous labels may constitute a whole which is 
meaningful in that it corresponds to the right-hand side of a production. 
For example, the (only) label of the one-element sequence of edges 

       o 
    •-----• 

constitutes a letter; the labels of the sequence 

       o     x 
    •-----•-----• 

constitute an atom. 

We shall describe such meaningful combinations by connecting the 
extreme nodes of a contiguous sequence by an edge. The edge will be 
labelled with the name of an appropriate non-terminal symbol, as for 
example in Fig. 3.1. 

To be able to represent graphs in a program, we must give each node a 
unique name. For example, we can name nodes with numbers: 

      (   a   ,   (   b   i   g   ,   o   x   )   ) 
    1   2   3   4   5   6   7   8   9   10  11  12  13 

We can represent such a graph as a set of edges, every edge expressed 
by a unit clause² that specifies the label of the edge and the names 
of the nodes it connects. Perhaps the most compact way is to use the label 
as the clause name, e.g. 

    atom( 9, 11 ). 
    letter( 9, 10 ). 
    o( 9, 10 ). 

¹ This manner of presentation is due to Colmerauer; it was also used by Kowalski 
(1979b). 

² For other ways of representing graphs in Prolog, see Sections 4.2.4 and 4.4.3. 


FIG. 3.1 Meaningful combinations of edges. 

We now observe that clauses which represent edges labelled with 
non-terminal symbols might be derived from those corresponding to terminal 
symbols, by virtue of general structural relationships inherent in the 
grammar. The reasoning would be roughly as follows: 

    letter( 9, 10 ) because o( 9, 10 ): o is a letter; 
    letter( 10, 11 ) because x( 10, 11 ): x is a letter; 
    atom( 10, 11 ) because letter( 10, 11 ): a letter makes an atom; 
    atom( 9, 11 ) because letter( 9, 10 ) and atom( 10, 11 ): a letter and 
        an atom make an atom. 

Relationships of this kind can be generalized in a straightforward 
manner, e.g. 

       letter( K, L ) :- o( K, L ). 
       letter( K, L ) :- x( K, L ). 
(3.3)  atom( K, L ) :- letter( K, L ). 
       atom( K, M ) :- letter( K, L ), atom( L, M ). 

Contiguity of edges is assured by using the same term (variable name) to 
denote every intermediate node: once at the end of an edge and once at 
the beginning of the next one. 

Given the clauses that describe edges with terminal symbols, e.g. 

(3.4)  o( 9, 10 ). 
       x( 10, 11 ). 

we might now derive all the remaining relevant edges. Strictly speaking, 
they would be present only implicitly. For example, to confirm the presence 
of the edge 

    atom( 9, 11 ) 

we would issue the command 

    :- atom( 9, 11 ). 


from which the following computation might ensue: 
       atom( 9, 11 ). 
       letter( 9, L ), atom( L, 11 ). 
       o( 9, L ), atom( L, 11 ). 
           L ← 10 
(3.5)  atom( 10, 11 ). 
       letter( 10, 11 ). 
       x( 10, 11 ). 
       success 
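
For readers who wish to try this at a terminal, the clauses (3.3) and (3.4) can be typed in as they stand. We assume only a present-day Prolog in which a user-defined atom with two parameters may coexist with the one-parameter built-in of the same name.

```prolog
% (3.4): the edges labelled with terminal symbols.
o(9, 10).
x(10, 11).

% (3.3): edges labelled with non-terminal symbols, derived on demand.
letter(K, L) :- o(K, L).
letter(K, L) :- x(K, L).
atom(K, L) :- letter(K, L).
atom(K, M) :- letter(K, L), atom(L, M).
```

The command :- atom( 9, 11 ). succeeds as traced in (3.5). Calling atom( K, L ) with two variables enumerates every atom edge implicit in the graph: first the two one-letter atoms, then the atom ox.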

The method of specifying the initial graph is rather awkward, even for 
this small example. Moreover, it requires that terminal symbols be only 
identifiers (nullary functors)—the restriction is unnatural but, fortunately, 
unnecessary. We shall now describe a slightly different and much handier 
notation. 

Names of nodes need not be consecutive integers. On the contrary, it 
is much better to derive (unique) names from the original sequence of 
terminal symbols than to introduce another, completely independent nomenclature. 
We shall exploit the one-one correspondence between a 
node and the sequence of (contiguous) edges following it. As the name of 
a node we shall take the list of terminal symbols labelling the corresponding 
sequence. For example, the leftmost node of the graph (3.2) will be 
named 

    '('.a.','.'('.b.i.g.','.o.x.')'.')'.[] 

and the name of the rightmost one (corresponding to the empty sequence 
of edges) will be 

    [] 

With this notation, the (implicit) clause describing the atom ox becomes 

    atom( o.x.')'.')'.[], ')'.')'.[] ). 

Notice how the underlying sequence of terminal symbols can be seen 
without resorting to separate clauses for o and x: it is simply the “difference” 
of the first and the second node names, o and x in our case. For a 
terminal symbol this difference is guaranteed to consist of the symbol 
itself, as, say, in 

                    x 
    x.')'.')'.[] •-----• ')'.')'.[] 

In other words, if an edge connects the nodes X, Y and is labelled with the 
terminal symbol T, then 


X = T.Y 
In order to allow arbitrary terms as terminal symbols, we can write, 
e.g. 

    terminal( o, K, L ) 

instead of o( K, L ). Moreover, rather than writing 

    terminal( o, o.x.')'.')'.[], x.')'.')'.[] ). 
    terminal( x, x.')'.')'.[], ')'.')'.[] ). 

we shall use the general-purpose, one-clause auxiliary procedure 

    terminal( T, T.Y, Y ). 

However, now we need some other way of specifying the initial sequence 
of terminal symbols, which in the previous formulation could be read 
from the assertions (3.4). Before we explain this, we shall rewrite (3.3): 

    letter( K, L ) :- terminal( o, K, L ). 
    letter( K, L ) :- terminal( x, K, L ). 
    atom( K, L ) :- letter( K, L ). 
    atom( K, M ) :- letter( K, L ), atom( L, M ). 
    terminal( T, T.Y, Y ). 

The computation analogous to that shown in (3.5) would now look as 
follows: 


       atom( o.x.')'.')'.[], ')'.')'.[] ). 
       letter( o.x.')'.')'.[], L ), atom( L, ')'.')'.[] ). 
       terminal( o, o.x.')'.')'.[], L ), atom( L, ')'.')'.[] ). 
           L ← x.')'.')'.[] 
(3.6)  atom( x.')'.')'.[], ')'.')'.[] ). 
       letter( x.')'.')'.[], ')'.')'.[] ). 
       terminal( x, x.')'.')'.[], ')'.')'.[] ). 
       success 

All the necessary information about the initial graph was supplied by the 
first call. What is more, the graph itself is now implicit: we only get—and 
manipulate—the two sequences of terminal symbols. 
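
The same five clauses can be run in a present-day Prolog if the dot notation T.Y is replaced by the bracket notation [T|Y]; this change of list syntax is the only assumption we make here, and the clauses are otherwise those given above.

```prolog
% The general-purpose auxiliary procedure: T is the first terminal
% symbol of the list [T|Y], and Y is what remains after it.
terminal(T, [T|Y], Y).

letter(K, L) :- terminal(o, K, L).
letter(K, L) :- terminal(x, K, L).
atom(K, L) :- letter(K, L).
atom(K, M) :- letter(K, L), atom(L, M).
```

The call atom( [o,x,')',')'], [')',')'] ) succeeds by the computation (3.6). With a variable as the second parameter the call has two solutions, one for the atom o and one for the atom ox.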

We are now in a good position to restate each instance of the parsing 
problem in terms of Prolog. A grammar is given in the form of Prolog 
clauses, each clause corresponding to some structural relationship between 
a unit and its immediate components (in particular, to a BNF rule). 
For example, 

    items( K, N ) :- 
        item( K, L ), terminal( ',', L, M ), items( M, N ). 

A call on one of these clauses (or, to be more precise, on the procedure to 
which it belongs) fully specifies two lists of terminal symbols, the second 
being the tail of the first. As a matter of convention, the clause name will 
also be the name of a nonterminal symbol, i.e. it will tell us what structure 
we want to attribute to the underlying sequence of terminal symbols. For 
example, the call 

    items( b.i.g.','.o.x.')'.')'.[], ')'.')'.[] ) 

can be interpreted as the question: In the graph determined by the parameters, 
can an edge labelled with items be validly drawn between the extreme 
nodes? Or briefly: Is items the valid structure of a given sequence 
of terminal symbols, big,ox in our case? 

The answer to this question is YES if the call succeeds, and NO 
otherwise. Examples of unsuccessful attempts to parse are: 

    items( ','.o.x.')'.')'.[], ')'.')'.[] ). 
        /items cannot begin with a comma/ 
    atom( o.x.')'.')'.[], ')'.[] ). 
        /atom cannot end with a bracket/ 

Procedural interpretation can be expressed in terms of the successive 
augmentation of the original graph. Every successful call implicitly adds 
an edge. Parsing succeeds if we can connect the extreme nodes with a 
single edge. This construction proceeds bottom-up: we can imagine an 
edge being added only after the successful termination of a corresponding 
call. 

We shall illustrate this by a complete program for parsing lists. 

       list( K, M ) :- terminal( '(', K, L ), terminal( ')', L, M ). 
       list( K, N ) :- 
           terminal( '(', K, L ), items( L, M ), terminal( ')', M, N ). 
       items( K, L ) :- item( K, L ). 
       items( K, N ) :- 
           item( K, L ), terminal( ',', L, M ), items( M, N ). 
(3.7)  item( K, L ) :- atom( K, L ). 
       item( K, L ) :- list( K, L ). 
       atom( K, L ) :- letter( K, L ). 
       atom( K, M ) :- letter( K, L ), atom( L, M ). 
       letter( K, L ) :- terminal( a, K, L ). 
         . . . 
       letter( K, L ) :- terminal( z, K, L ). 
       terminal( T, T.Y, Y ). 

The call on list in the command 

    list( '('.a.','.'('.b.i.g.','.o.x.')'.')'.[], [] ). 

results in the implicit construction of the graph shown in Fig. 3.2. Notice 
the similarity of this graph to a conventional parse tree (Fig. 3.3). 

The parameters of a call that initiates the parsing serve as an input 
and an output parameter. The former contains a given list of terminal 
symbols. Some initial segment of this list is supposed to constitute the 
unit under consideration. For example, in 

    atom( o.x.')'.')'.[], ')'.')'.[] ) 

we expect that some initial part of the list 

o.x.’)’.’)’.[] 

constitutes an atom. Should that be the case, the computation succeeds 
provided the second parameter matches the tail of the list which remains 
after “chopping off” the initial segment. For example, ')'.')'.[] remains 
after chopping o and x off the list o.x.')'.')'.[]. In most cases the second 
parameter is a variable, so that it actually behaves like an output parameter. 
As an example, the call 

(3.8)  atom( o.x.')'.')'.[], Tail ) 

instantiates Tail as ')'.')'.[]; compare this with (3.6). 


FIG. 3.3 The parse tree for the list (a,(big,ox)). 


If the second parameter is a variable, only the entry node of some 
subgraph of the whole graph is known. Parsing then may give ambiguous 
results. For example, the call 

    items( b.i.g.','.o.x.')'.')'.[], Tail ) 

might succeed with Tail instantiated to ','.o.x.')'.')'.[] or to ')'.')'.[]. In 
general, the results depend on how the clauses of a parsing program are 
ordered. In the program above, the recursive clause for items would only 
be activated because of forced failure coming after a successful parsing 
of big as items. 

Recall now that the parameter of a Prolog procedure can, in principle, 
be bi-directional, the direction—input or output—depending on the form 
of the corresponding actual parameter. This also applies to calls that 
initiate parsing. If the first parameter is a variable, what we ask is whether 
there exists a sequence of terminal symbols that has a particular struc¬ 
ture. For example, the call 

    list( AList, [] ) 

should instantiate AList to any valid list of terminal symbols; in other 
words, some list should be constructed, or synthesized. One example of 
such a list is the empty list. 

However, the situation is not fully symmetric. For any given sequence 
of terminal symbols, a call on list either succeeds or fails, i.e. 
every sequence can be classified as a list or a non-list, so it can be 
syntactically analysed. Not so with synthesis. It is easy to see that the two calls 

    list( AList, [] ), fail 

will act as a generator of one-element lists: 

    () (a) (b) ... (z) (aa) (ab) ... (az) (aaa) (aab) ... 

Moreover, if we reorder the two clauses for item, the call on item with a 
variable first parameter would result in infinite recursion. 
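
Both directions of use can be tried with the following transcription of the program (3.7). It is a sketch under two assumptions of ours: lists are written in bracket notation, [T|Y] instead of T.Y, and the twenty-six clauses for letter are compressed into one clause that picks a letter with member (from the lists library of present-day systems).

```prolog
:- use_module(library(lists)).

terminal(T, [T|Y], Y).

list(K, M) :- terminal('(', K, L), terminal(')', L, M).
list(K, N) :- terminal('(', K, L), items(L, M), terminal(')', M, N).
items(K, L) :- item(K, L).
items(K, N) :- item(K, L), terminal(',', L, M), items(M, N).
item(K, L) :- atom(K, L).
item(K, L) :- list(K, L).
atom(K, L) :- letter(K, L).
atom(K, M) :- letter(K, L), atom(L, M).

% Stands for the 26 clauses letter(K, L) :- terminal(a, K, L). etc.
letter(K, L) :-
    member(C, [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]),
    terminal(C, K, L).
```

For analysis, atom_chars( '(a,(big,ox))', Cs ), list( Cs, [] ) succeeds. For synthesis it is prudent to bound the search: length( AList, 3 ), list( AList, [] ) generates exactly the lists (a), (b), ..., (z) on backtracking and then fails, whereas the unbounded call never leaves the one-element lists.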


3.2. THE SIMPLEST FORM 
OF GRAMMAR RULES 


The input and output parameters of the clauses that constitute a parsing 
program, such as (3.7), are the basis of yet another interpretation of 
those clauses: in terms of operations on sequences of terminal symbols. 
Take the clause 

    items( K, N ) :- item( K, L ), terminal( ',', L, M ), items( M, N ). 

It can be read as follows: (an instance of) items can be “chopped off” 
(recognized at the beginning of) K, leaving N, if (an instance of) item can 
be chopped off K, leaving L, and then a comma can be chopped off L, 
leaving M, and finally (another instance of) items can be chopped off M, 
leaving N. Now the essence of all this is that items consist of an item, a 
comma, and items. The other information can be routinely added to this 
fundamental fact. All we need is four variables to stand for successive 
remainders of the initial sequence of terminal symbols. 
In the notation we shall use henceforth, this routine information is 
suppressed. The notation resembles BNF productions. The lefthand side 
of a Prolog grammar rule names the construction, and the righthand side 
enumerates its constituents. For example: 

    atom → letter, atom. 

The symbol → is rendered in Prolog as --> (it must be written without 
intervening blanks). There is a simple convention to distinguish nonterminal 
and terminal symbols: the latter are enclosed in square brackets, e.g. 

    items → item, [ ',' ], items. 

Contiguous terminal symbols can be enclosed in a single pair of brackets. 
For example, the rule for empty lists can be written as 

    list → [ '(', ')' ]. 

If all terminal symbols are characters (one-character nullary functors), we 
can use string notation: 

    list → "()". 

Such grammar rules are merely syntactic sugar for the underlying 
clauses. The translation is fairly straightforward, the gain in clarity significant. 
However, some Prolog implementations, especially on small computers, 
do not support grammar rule notation. Even then it seems worthwhile 
to write a preprocessor in Prolog (we shall describe such a 
preprocessor in Section 7.4.4). 

The counterpart of a parsing program, written down as a collection of 
grammar rules, will be called a metamorphosis grammar³, or grammar for 
short. Here is the grammar of lists, corresponding to the program (3.7). 

    list → [ '(', ')' ]. 
    list → [ '(' ], items, [ ')' ]. 
    items → item. 
    items → item, [ ',' ], items. 
    item → atom. 
    item → list. 
    atom → letter. 
    atom → letter, atom. 
    letter → [ a ]. 
      . . . 
    letter → [ z ]. 

³ This is the name invented by Colmerauer (1975, 1978). The name “definite clause 
grammars” was later introduced by Pereira and Warren (1980) for metamorphosis grammars 
in normal form (as defined by Colmerauer). 

The procedure terminal need not be explicitly given (it ought to be provided 
by the implementation). 

This grammar deserves its name. It is best understood independently 
of the Prolog program it has been used to conceal. Every rule reflects the 
“consist of” relationship between a whole and its constituents, exactly as 
the original BNF grammar does. However, it should be remembered that 
the grammar is also a program in disguise, and is executable immediately, 
without any additional effort on the programmer’s part! 

Parsing can be initiated in two ways. First, we can simply call one of 
the underlying procedures, e.g. 

    list( '('.a.','.'('.b.i.g.','.o.x.')'.')'.[], [] ). 

Second, we can use the built-in procedure phrase with two parameters: 
the nonterminal symbol and the sequence of terminal symbols (which is 
supposed to be an instance of the nonterminal). For example: 

    phrase( list, '('.a.','.'('.b.i.g.','.o.x.')'.')'.[] ). 

It should be pointed out that the first way brings out the routine information 
we just managed to hide. On the other hand, the second way is less 
flexible, e.g. we cannot use phrase to perform calls such as (3.8). 
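
The grammar of lists can be run directly in present-day systems that support the --> notation. The sketch below assumes such a system (we have SWI-Prolog in mind); it writes lists in bracket notation, and, as a shortcut of ours, it replaces the twenty-six rules for letter by a single rule with a condition in curly brackets, a device that is not part of the simple rules described so far.

```prolog
:- use_module(library(lists)).

list  --> ['(', ')'].
list  --> ['('], items, [')'].
items --> item.
items --> item, [','], items.
item  --> atom.
item  --> list.
atom  --> letter.
atom  --> letter, atom.

% Stands for the rules letter --> [a]. ... letter --> [z].
letter --> [C],
    { member(C, [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z]) }.
```

With this grammar, atom_chars( '(a,(big,ox))', Cs ), phrase( list, Cs ) succeeds. Many present-day systems also provide a three-parameter phrase whose last parameter is the remaining tail, so that phrase( atom, [o,x,')',')'], Tail ) plays the role of the call (3.8); whether a given implementation has it should be checked in its manual.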


3.3. PARAMETERS OF NON-TERMINAL 
SYMBOLS 


Grammars of the kind described so far are of little practical use. We 
seldom parse anything just to accept or reject it. More often than not, we 
need to compute the representation of its structure or to transform it 
somehow, and we must do this while accepting the input. The representa¬ 
tion of the structure will be built step by step, with the terminal symbols 
taken into account in succession. 

We shall give an example. Suppose we want to build a parse tree—a 
Prolog term—for every valid sequence of terminal symbols that consti¬ 
tute a list; see Fig. 3.3. To this end, we shall give each of the procedures in 
(3.7) an additional parameter to hold the representation (of a structure) to 
be constructed upon exit from the procedure. We must not meddle with 
input and output parameters: their role remains the same as before. Here 
is the program. 

    list( list( '(', ')' ), K, M ) :- 
        terminal( '(', K, L ), terminal( ')', L, M ). 
    list( list( '(', ITEMS, ')' ), K, N ) :- 
        terminal( '(', K, L ), items( ITEMS, L, M ), 
        terminal( ')', M, N ). 
    items( items( ITEM ), K, L ) :- item( ITEM, K, L ). 
    items( items( ITEM, ',', ITEMS ), K, N ) :- 
        item( ITEM, K, L ), terminal( ',', L, M ), 
        items( ITEMS, M, N ). 
    item( item( ATOM ), K, L ) :- atom( ATOM, K, L ). 
    item( item( LIST ), K, L ) :- list( LIST, K, L ). 
    atom( atom( LETTER ), K, L ) :- letter( LETTER, K, L ). 
    atom( atom( LETTER, ATOM ), K, M ) :- 
        letter( LETTER, K, L ), atom( ATOM, L, M ). 
    letter( letter( a ), K, L ) :- terminal( a, K, L ). 
      . . . 
    letter( letter( z ), K, L ) :- terminal( z, K, L ). 

Again, we shall suppress the routine information, i.e. leave out the 
input and output parameters. The resulting grammar will be as follows: 

    list( list( '(', ')' ) ) → [ '(', ')' ]. 
    list( list( '(', ITEMS, ')' ) ) → 
        [ '(' ], items( ITEMS ), [ ')' ]. 
    items( items( ITEM ) ) → item( ITEM ). 
    items( items( ITEM, ',', ITEMS ) ) → 
        item( ITEM ), [ ',' ], items( ITEMS ). 
    item( item( ATOM ) ) → atom( ATOM ). 
    item( item( LIST ) ) → list( LIST ). 
    atom( atom( LETTER ) ) → letter( LETTER ). 
    atom( atom( LETTER, ATOM ) ) → letter( LETTER ), 
        atom( ATOM ). 
    letter( letter( a ) ) → [ a ]. 
      . . . 
    letter( letter( z ) ) → [ z ]. 

To compute the parse tree of Fig. 3.3, call: 

    phrase( list( T ), '('.a.','.'('.b.i.g.','.o.x.')'.')'.[] ). 

The conciseness and power of metamorphosis grammars can hardly 
be appreciated in this tiny example. We shall show a grammar that describes 
(and parses) sequences of statements of a simple programming 
language. The admissible statements are: assignment, if-then-else-fi, 
while-do-od, and skip. The sequencing operator is the semicolon. The 
condition is either an arithmetic relation (= or <) or a negated relation⁴. 

FIG. 3.4 An abstract syntax tree. 

The intended meaning of a sequence of statements is the term that 
shows its structure. We shall not go into details; instead, we shall give an 
example which ought to explain the idea. Given the (one-element) sequence 
of statements: 

    if n < 0 then skip else 
        i := 0 ; 
        while not n < (i + 1)*(i + 1) do 
            i := i + 1 
        od 
    fi 

we should obtain the abstract syntax tree (a Prolog term): 

       if( lt( n, 0 ), skip, seq( assign( i, 0 ), 
(3.9)      while( not( lt( n, '*'( '+'( i, 1 ), '+'( i, 1 ) ) ) ), 
               assign( i, '+'( i, 1 ) ) ) ) ) 

The same tree is shown in Fig. 3.4. 

⁴ Both parts of this example, here and in Section 3.4.1, are modelled on the illustration 
in Colmerauer’s original paper (1975). 

Terminal symbols of our grammar are tokens (lexical units of the 
language), e.g. if, n, +, (. Variables and expressions are intentionally left 
undefined: we want to avoid too many details. A grammar for expressions 
will be discussed in Section 3.5.2. The following ten rules take care of the 
rest of language constructions. 

    statements( S ) → statement( S ). 
    statements( seq( S, OtherS ) ) → 
        statement( S ), [ ';' ], statements( OtherS ). 
    statement( assign( V, E ) ) → 
        variable( V ), [ := ], expression( E ). 
    statement( if( C, S1, S2 ) ) → 
        [ if ], condition( C ), [ then ], statements( S1 ), 
        [ else ], statements( S2 ), [ fi ]. 
    statement( while( C, S ) ) → 
        [ while ], condition( C ), [ do ], statements( S ), [ od ]. 
    statement( skip ) → [ skip ]. 
    condition( R ) → relation( R ). 
    condition( not( R ) ) → [ 'not' ], relation( R ). 
    relation( eq( E1, E2 ) ) → 
        expression( E1 ), [ '=' ], expression( E2 ). 
    relation( lt( E1, E2 ) ) → 
        expression( E1 ), [ '<' ], expression( E2 ). 

This grammar would probably be activated by calls such as 

    ... read_a_list_of_tokens( LisT ), 
        phrase( statements( Structure ), LisT ) ... 

which analyse LisT and instantiate Structure appropriately, or fail if LisT
is not a valid sequence of statements. Another possibility (not always
practical, though) is to build (synthesize, if you prefer) a list of Tokens
starting from a given structure:

    ... take_a_structure( S ), phrase( statements( S ), Tokens ) ...

Here, Tokens will be instantiated provided that S is a proper structure.
The grammar establishes a one-to-one correspondence between structures
and lists of tokens, and provides transformation both ways.
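
As a rough illustration of both directions, the grammar can be run as a definite clause grammar in a present-day Prolog system. The sketch below assumes the standard --> notation and phrase/2. The one-token definitions of variable and expression are stand-ins invented for this example (the text deliberately leaves them undefined), and ':=' is quoted for portability.

```prolog
% Sketch only: variable//1 and expression//1 are one-token stand-ins,
% assumed here because the text leaves them undefined.
statements( S ) --> statement( S ).
statements( seq( S, OtherS ) ) -->
    statement( S ), [ ';' ], statements( OtherS ).

statement( assign( V, E ) ) -->
    variable( V ), [ ':=' ], expression( E ).
statement( if( C, S1, S2 ) ) -->
    [ if ], condition( C ), [ then ], statements( S1 ),
    [ else ], statements( S2 ), [ fi ].
statement( while( C, S ) ) -->
    [ while ], condition( C ), [ do ], statements( S ), [ od ].
statement( skip ) --> [ skip ].

condition( R ) --> relation( R ).
condition( not( R ) ) --> [ 'not' ], relation( R ).

relation( eq( E1, E2 ) ) -->
    expression( E1 ), [ '=' ], expression( E2 ).
relation( lt( E1, E2 ) ) -->
    expression( E1 ), [ '<' ], expression( E2 ).

variable( V ) --> [ V ], { atom( V ) }.        % hypothetical: any atom
expression( E ) --> [ E ], { integer( E ) }.   % hypothetical: a constant
expression( E ) --> variable( E ).             % ... or a variable

% Analysis:
% ?- phrase( statements( S ), [ i, ':=', 0, ';', skip ] ).
%    S = seq( assign( i, 0 ), skip )
% Synthesis:
% ?- phrase( statements( seq( skip, skip ) ), Tokens ).
%    Tokens = [ skip, ';', skip ]
```

The same rules serve both calls; only the instantiation pattern of the arguments of phrase differs.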

A more realistic example of synthesis based on a metamorphosis 
grammar will be given in the next section. Here we only observe that in 
both cases (analysis and synthesis) similar computations ensue. They 
differ because, on analysis, the sequence of terminal symbols “controls” 
the computation (i.e. determines the choice of rules) whereas, on synthesis,
it is “controlled” by the initial non-terminal symbol’s parameter.


3.4. EXTENSIONS 


3.4.1. Conditions 

Grammar rules described so far correspond to clauses in which every 
call manipulates the sequence of terminal symbols, i.e. every call has an 
input and an output parameter. Other calls could be inserted in between 
without affecting the transfer of terminal symbols. The question is: Would 
it be useful, and how could it be interpreted? 

As a simple possibility, consider the cut in the first clause of list:

    list( list( '(', ')' ), K, M ) :-
        terminal( '(', K, L ), terminal( ')', L, M ), !.

The cut turns the computation based on the list procedure into a
“deterministic” process: it handles either the empty list or non-empty
lists. It does not matter when we want to recognize a list. However, it is
now impossible to generate lists. The command

    list( L, T, [] ), write( L ), write( T ), nl, fail.

will only write one instance of L and T, namely

    list( '(', ')' ) and '('.')'.[]

The gain from the cut is small in this case, anyway. Cuts would be of 
much greater use, say, in the program that parses statements (see the 
previous section), where long and deep computations may occur. 

Another example: suppose we want to change the program for parsing
lists so that for an atom it produces a Prolog atom instead of a parse
tree, e.g. returns

    list( '(', items( item( big ), items( item( ox ) ) ), ')' )

for the list (big,ox). One way to do so is to make the procedure for atoms
return a Prolog list of letters, and apply the built-in procedure pname (see
Section 5.10) to this list:

    item( item( ATOM ), K, L ) :-
        atom( LETTERS, K, L ), pname( ATOM, LETTERS ).
    item( item( LIST ), K, L ) :- list( LIST, K, L ).
    atom( LETTER.[], K, L ) :- letter( LETTER, K, L ).
    atom( LETTER.LETTERS, K, M ) :-
        letter( LETTER, K, L ), atom( LETTERS, L, M ).
    letter( a, K, L ) :- terminal( a, K, L ).
        ...
    letter( z, K, L ) :- terminal( z, K, L ).

One final example: in the program above we shall replace the 26
clauses that define letters by a single clause:

    letter( LETTER, K, L ) :-
        terminal( LETTER, K, L ), isletter( LETTER ).

with isletter defined, say, as

    isletter( LETT ) :- a @=< LETT, LETT @=< z.

This new clause can be used as follows:

    letter( Lett, x.')'.')'.[], Tail ).
        terminal( Lett, x.')'.')'.[], Tail ), isletter( Lett ).
            Lett ← x, Tail ← ')'.')'.[]
        isletter( x ).
        etc.

The variable in the call on terminal matches every terminal symbol. If 
the terminal symbol is not a letter, a call on isletter will fail and a letter 
will not be recognized. We call such terminal symbols variable terminals: 
the first (still unprocessed) symbol is selected and is then either accepted 
or rejected, e.g. according to the result of a test such as isletter. 

Extra calls that do not comprise input and output parameters have
been known as conditions, but the name is slightly misleading. Only in the
last example can isletter(Lett) be interpreted as a condition: the clause
will only be applied if isletter succeeds. The call on pname in the second
example is rather an action performed on the parameters of non-terminal
symbols. Finally, the cut can be reasonably interpreted exactly as in any
other clause, as pragmatic information on the future use of the clause.

Conditions in metamorphosis grammars are enclosed in curly brackets,
so that they will not be confused with terminal and non-terminal
symbols. Examples:

    list( list( '(', ')' ) ) --> [ '(', ')' ], { ! }.
    item( item( ATOM ) ) --> atom( LETTERS ),
                             { pname( ATOM, LETTERS ) }.
    letter( LETTER ) --> [ LETTER ], { isletter( LETTER ) }.

As an exception, the cut need not be placed within curly brackets, e.g.

    list( list( '(', ')' ) ) --> [ '(', ')' ], !.
Contiguous conditions can be combined in a single pair of brackets, and in
general a condition can also contain alternatives separated by semicolons,
e.g.

    alphanum( Char ) --> [ Char ], { isletter( Char ) ; isdigit( Char ) }.
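
The role of conditions and actions in curly brackets can be tried out with a small runnable sketch. It assumes a modern Prolog with standard DCG support. The built-in atom_chars/2 stands in for pname, the non-terminal atom is renamed word to keep it visually apart from the built-in atom/1, and isdigit is an assumed definition in the style of isletter; tokens are single-character atoms.

```prolog
% Sketch only: atom_chars/2 replaces the book's pname; word//1 plays the
% part of atom; isdigit/1 is an assumed helper.
item( item( Atom ) ) --> word( Letters ), { atom_chars( Atom, Letters ) }.

word( [ L ] ) --> letter( L ).
word( [ L | Ls ] ) --> letter( L ), word( Ls ).

letter( L ) --> [ L ], { isletter( L ) }.              % a condition
alphanum( C ) --> [ C ], { isletter( C ) ; isdigit( C ) }.

isletter( C ) :- atom( C ), a @=< C, C @=< z.
isdigit( C ) :- atom( C ), '0' @=< C, C @=< '9'.

% ?- phrase( item( I ), [ b, i, g ] ).
%    I = item( big )
```

Here isletter acts as a genuine condition (the rule applies only if it succeeds), while the call on atom_chars is an action on the parameters, as discussed above.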

We shall now present a small fragment of a metamorphosis grammar,
meant primarily for synthesis (but applicable both ways, although not
without reservations). We want to take a structure computed by the
grammar for statements (see the previous section) and produce its
translation into a machine-oriented symbolic language. We shall only give a
hint of the target language by showing schematic translations of
while(C, S) and if(C, S1, S2).

Let C' and S' be the translations of C and S. The evaluation of C' sets a
flag used implicitly by a conditional jump instruction. Let L1, L2 be unique
labels. The translation of while(C, S) will be

    label( L1 )
    not(C)'
    jumpiftrue( L2 )
    S'
    jump( L1 )
    label( L2 )

where not(C)' is the translation of not(C). The translation of
if(C, S1, S2) will be

    C'
    jumpiftrue( L1 )
    S2'
    jump( L2 )
    label( L1 )
    S1'
    label( L2 )

The “code generator” can be written as a grammar of the target 
language. By way of explanation, we shall show three of the rules that 
belong to the uppermost level of the definition: 

    code( seq( S, OtherS ) ) --> code( S ), code( OtherS ).
    code( while( C, S ) ) -->
        { newlabel( L1 ) }, [ label( L1 ) ], codecond( not( C ) ),
        { newlabel( L2 ) }, [ jumpiftrue( L2 ) ], code( S ),
        [ jump( L1 ), label( L2 ) ].
    code( skip ) --> [].

The action newlabel can generate a new, unique label. The definition
of codecond will be given below. The third rule illustrates a new feature of
grammar rules. If the righthand side contains no terminal and non-terminal
symbols, nothing will be produced during synthesis and nothing will
be “chopped off” during analysis. The underlying clause is

code( skip, K, K ). 

Try to trace the execution of 

:- code( seq( skip, skip ), Translation, [] ). 

Assuming that coderel defines the grammar of codes for relations eq
and lt, the definition of codecond can be as follows:

    codecond( not( not( C ) ) ) --> codecond( C ).
    codecond( not( Rel ) ) --> coderel( Rel ), [ revert(_) ].
    codecond( Rel ) --> coderel( Rel ).

where “revert” is an instruction of the target language that resets the 
“condition flag”. 

The example would be completed after specifying the translation of 
expressions and of assignments, in particular the handling of variables. 
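
The fragment of the code generator can be exercised as it stands once the missing pieces are stubbed. The sketch below assumes a modern Prolog with standard DCG support. Both newlabel (a simple counter kept in the database) and coderel (a single made-up compare instruction) are assumptions introduced for this example, not the definitions intended by the text.

```prolog
% Sketch only: newlabel/1 and coderel//1 are assumptions made for this
% example; the text does not define them.
:- dynamic label_counter/1.
label_counter( 0 ).

newlabel( L ) :-                        % yields 1, 2, 3, ...
    retract( label_counter( N ) ),
    L is N + 1,
    assertz( label_counter( L ) ).

code( seq( S, OtherS ) ) --> code( S ), code( OtherS ).
code( while( C, S ) ) -->
    { newlabel( L1 ) }, [ label( L1 ) ], codecond( not( C ) ),
    { newlabel( L2 ) }, [ jumpiftrue( L2 ) ], code( S ),
    [ jump( L1 ), label( L2 ) ].
code( skip ) --> [].

codecond( not( not( C ) ) ) --> codecond( C ).
codecond( not( Rel ) ) --> coderel( Rel ), [ revert(_) ].
codecond( Rel ) --> coderel( Rel ).

coderel( Rel ) --> [ compare( Rel ) ].  % hypothetical one-instruction stub

% ?- phrase( code( seq( skip, skip ) ), Translation ).
%    Translation = []
```

A call such as phrase( code( while( lt( i, n ), skip ) ), T ) yields, as its first answer, a list that follows the schematic translation of while given earlier, with two distinct labels drawn from the counter.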

The code generator together with the grammar of statements might
constitute the core of a simple compiler. Its overall structure might be:

    compile :- read_tokens( Token_list ),
               parse( Token_list, Syntax_tree ),
               generate_code( Syntax_tree, Object_code ),
               write_code( Object_code ).

with parse and generate_code defined as

    parse( T, S ) :- phrase( statements( S ), T ).
    generate_code( S, O ) :- phrase( code( S ), O ).

The procedure read_tokens, reading the source program in and performing
lexical analysis, might also be (partly) written as a metamorphosis
grammar; see Colmerauer (1975, 1978).


3.4.2. Context

Another feature of grammar rules in Prolog is a mechanism for modifying
the sequence of terminal symbols during the computation. In general,
this would require explicit manipulations on input and output parameters,
but such general mechanisms seem only necessary in natural
language processing (an important application of Prolog). A very restricted
mechanism, so-called context grammar rules, is quite sufficient,
though, in most of the other applications.

In a context grammar rule, the lefthand side is supplemented by a
so-called context⁵: terminal symbols, preceding the arrow. For example:

    otherst( S, S ), [ Delim ] --> [ Delim ], { stsdelim( Delim ) }.
    do, [ 'not' ] --> dont.

The output parameter in the head of an underlying clause is appended to
the context. As clauses, the above rules are:

    otherst( S, S, K, Delim.L ) :-
        terminal( Delim, K, L ), stsdelim( Delim ).
    do( K, 'not'.L ) :- dont( K, L ).

The first rule can be interpreted without resorting to the corresponding
clause; we shall give the interpretation below. The second rule, however,
can only be explained in terms of manipulations on sequences of terminal
symbols: a new terminal symbol appears after recognizing an instance of
dont, and only then is an instance of do recognized as well. We shall
elaborate on this example a little, too.

First we come back to the grammar for statements. In its present
shape it performs rather poorly on incorrect inputs. It fails without giving
any message or diagnostics. We shall try to improve the definition of
statements, leaving the other rules as an exercise. We observe that a
statement (other than the last) may be delimited by a semicolon (it indicates
that there are other statements in this sequence), by else, fi, or od.
Other delimiters are erroneous. In case of errors, no meaningful structure
may be found for the whole sequence of statements, but we elect to
continue the analysis, after skipping a portion of input up to the nearest
semicolon. Here are some rules of a grammar that implements these
ideas.

    statements( Sts ) --> statement( St ), otherst( St, Sts ).
    otherst( St1, seq( St1, Sts ) ) -->
        [ ';' ], statement( St2 ), otherst( St2, Sts ).
    otherst( St, St ), [ Delim ] -->
        [ Delim ], { stsdelim( Delim ) }.
    otherst( _, _ ) --> [ T ], erroneous( T ).
    otherst( St, St ) --> [].      % this for the last statement
    erroneous( T ) --> { write( bad( T ) ), nl }, skipped.
    skipped, [ ';' ] --> [ ';' ].
    skipped --> [ _ ], skipped.
    skipped --> [ _ ].      % if we are skipping the last statement
    stsdelim( else ). stsdelim( fi ). stsdelim( od ).

5 Readers familiar with context-sensitive grammars will notice that neither rule is a
proper context-sensitive rule. Even if we disregard parameters and conditions, the rules will
only belong to Chomskian type 0.

The context rule can be interpreted in the following manner: “the 
remainder of a sequence of statements is empty if we have encountered a 
proper delimiter; this delimiter is retained”. Notice that we have actually 
effected one-item lookahead on a list of terminal symbols. In general, 
we can have lookahead for any fixed number of terminal symbols, for 
example 

    p, [ T1, T2 ] --> [ T1, T2 ], { test( T1, T2 ) }.

This translates into

    p( K, T1.T2.M ) :-
        terminal( T1, K, L ), terminal( T2, L, M ), test( T1, T2 ).

We can use p to make the test; e.g. in

    a --> p, b, c.

p consumes no input, so that the rule is structurally equivalent to

    a --> b, c.

but it will only be applied if the two leftmost terminal symbols of the
current sequence pass the test.
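
Two-symbol lookahead of this kind can be checked directly in a Prolog that supports pushback in DCG heads (an assumption about the system used). In the sketch below, test/2 and the one-token rules for b and c are hypothetical and exist only to make the example runnable.

```prolog
% Sketch only: test/2, b//0 and c//0 are made up for the example.
p, [ T1, T2 ] --> [ T1, T2 ], { test( T1, T2 ) }.

test( T1, T2 ) :- T1 @< T2.     % hypothetical test: symbols in order

a --> p, b, c.
b --> [ _ ].
c --> [ _ ].

% ?- phrase( p, [ 1, 2, 3 ], Rest ).
%    Rest = [ 1, 2, 3 ]          % p has consumed nothing
```

With these definitions phrase( a, [ 1, 2 ] ) succeeds, whereas phrase( a, [ 2, 1 ] ) fails before b and c are ever tried.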

The second example is a very simplified little grammar that recognizes
auxiliary “do not”, “don’t”, “does not”, “doesn’t”. This particular
problem can easily be solved differently; the way we have chosen is
intended as an illustration of context grammar rules:

    aux --> do, [ 'not' ].
    do, [ 'not' ] --> dont.
    do --> [ do ].
    do --> [ does ].
    dont --> [ 'don''t' ].       % i.e. don't
    dont --> [ 'doesn''t' ].     % i.e. doesn't

The following computation should explain how this grammar is used: 

    aux( 'doesn''t'.like.it.[], Tail ).
        do( 'doesn''t'.like.it.[], T1 ), terminal( 'not', T1, Tail ).
            T1 ← 'not'.L
        dont( 'doesn''t'.like.it.[], L ),
        terminal( 'not', 'not'.L, Tail ).
        terminal( 'doesn''t', 'doesn''t'.like.it.[], L ),
        terminal( 'not', 'not'.L, Tail ).
            L ← like.it.[]
        terminal( 'not', 'not'.like.it.[], Tail ).
            Tail ← like.it.[]
        success
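
The computation can be reproduced in a Prolog whose grammar-rule translator accepts a context (pushback) on the lefthand side; this is assumed of the system used. The sketch writes token lists in the now usual bracket notation.

```prolog
aux --> do, [ 'not' ].
do, [ 'not' ] --> dont.     % context rule: 'not' is put back on the input
do --> [ do ].
do --> [ does ].
dont --> [ 'don''t' ].
dont --> [ 'doesn''t' ].

% ?- phrase( aux, [ 'doesn''t', like, it ], Tail ).
%    Tail = [ like, it ]
% ?- phrase( aux, [ do, 'not', like, it ], Tail ).
%    Tail = [ like, it ]
```

Both the contracted and the two-word forms leave the same remainder, which is what lets the rest of a grammar treat them alike.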

Our last example is a small grammar that discards leading zeroes from 
an integer represented as a list of digits: 

    zeroes, [ D ] --> [ 0 ], zeroes, [ D ], { digit( D ) }.
    zeroes --> [].

You may wish to trace the execution of the directives 

zeroes( 0.3.[], Tail ). 
zeroes( 0.0.[], Tail ). 
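
For readers who prefer to run the directives rather than trace them by hand, here is a sketch in present-day DCG notation. The definition of digit is an assumption (the text does not give one), and lists of digits are lists of integers.

```prolog
zeroes, [ D ] --> [ 0 ], zeroes, [ D ], { digit( D ) }.
zeroes --> [].

digit( D ) :- integer( D ), D >= 0, D =< 9.   % assumed definition

% First answers:
% ?- phrase( zeroes, [ 0, 3 ], Tail ).     Tail = [ 3 ]
% ?- phrase( zeroes, [ 0, 0 ], Tail ).     Tail = [ 0 ]
```

The second answer shows why the context is needed: the digit that follows the discarded zeroes is put back, so the integer zero keeps its last digit. Further answers appear on backtracking, since the second rule discards nothing.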


3.4.3. Alternatives

Two or more grammar rules with the same lefthand side (including
context and parameters of the non-terminal symbol) can be combined into
a single rule with the common lefthand side and with the righthand side
taking the form of alternatives: a sequence of original righthand sides
separated by semicolons. For example:

    list --> [ '(', ')' ] ; [ '(' ], items, [ ')' ].
    items --> item ; item, [ ',' ], items.
    item --> atom ; list.
    atom --> letter ; letter, atom.
    letter --> [ L ], { isletter( L ) }.

Notice how—at last—we managed to come back rather closely to the 
original BNF grammar (3.1). 

The translation of a rule with an alternative into an underlying clause 
is straightforward. One example should be sufficient: 

    items( K, N ) :- item( K, N ) ;
        item( K, L ), terminal( ',', L, M ), items( M, N ).

The notation with alternatives is, strictly speaking, a “convenience”
rather than a real extension, and (like alternatives in ordinary clauses;
see Section 1.3.7) it can sometimes adversely affect the grammar’s
readability.
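
A grammar for lists written with alternatives can be tried as follows. This is a sketch for a modern Prolog: the non-terminal atom is renamed word here to keep it apart from the built-in atom/1, tokens are single-character atoms, and isletter carries an extra atom check that is not in the text.

```prolog
% Sketch only: word//0 plays the part of atom; tokens are one-character atoms.
list --> [ '(', ')' ] ; [ '(' ], items, [ ')' ].
items --> item ; item, [ ',' ], items.
item --> word ; list.
word --> letter ; letter, word.
letter --> [ L ], { isletter( L ) }.

isletter( L ) :- atom( L ), a @=< L, L @=< z.

% ?- phrase( list, [ '(', b, i, g, ',', o, x, ')' ] ).     succeeds
% ?- phrase( list, [ '(', b, ',', ')' ] ).                 fails
```

Each semicolon introduces a choice point exactly as a separate rule would, so the behaviour on backtracking is unchanged.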
3.4.4. Syntax of Grammar Rules: Summary

We shall now give a metamorphosis grammar that describes the full syntax
of grammar rules supported by Prolog-10. The principles of mapping
rules onto underlying clauses have been discussed at length in the previous
sections, so we choose not to overburden the grammar with parameters
that would take care of the translation. However, we encourage you to
try and augment the grammar along these lines. A hint: most of the
non-terminal symbols should be given three parameters, two variables (to
construct an input and output parameter) and a term (to hold the partial
translation). For example:

    grammar_rule( ( Tr_of_left :- Tr_of_right ), In_var, Out_var )
        --> lefthand_side( Tr_of_left, In_var, Out_var ), [ '-->' ],
            righthand_side( Tr_of_right, In_var, Out_var ), [ '.' ].
    rule_items( ( Tr_of_item, Tr_of_items ), Curr_in_var, Out_var )
        --> rule_item( Tr_of_item, Curr_in_var, Mid_var ), [ ',' ],
            rule_items( Tr_of_items, Mid_var, Out_var ).

In the actual translation we might eliminate the calls on the procedure 
terminal. Since terminal(T, K, L) means that K = T.L, we can substitute 
in advance T.L for K elsewhere in the clause. For example, in the clause 

    list( K, N ) :-
        terminal( '(', K, L ), items( L, M ), terminal( ')', M, N ).

we have K = '('.L and M = ')'.N, and after replacing K and M we obtain

    list( '('.L, N ) :- items( L, ')'.N ).

This is, in fact, what is done in many implementations (see, e.g., Section
7.4.9). As we have executed both calls on terminal beforehand, every
computation started by a call on list will be at least two steps shorter.
Here are some other examples of such an improved translation of grammar
rules:

    letter( Lett, Lett.L, L ) :- isletter( Lett ).
    p( T1.T2.M, T1.T2.M ) :- test( T1, T2 ).
    zeroes( 0.L, D.N ) :- zeroes( L, D.N ), digit( D ).
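
The equivalence of the two translations can be checked on a toy case. In the sketch below the names list_direct and list_improved are introduced only to let both versions coexist, and items is a one-clause stand-in; none of these appear in the text. Lists are written in bracket notation.

```prolog
% terminal( T, K, L ): K is the list L with the terminal T in front.
terminal( T, [ T | L ], L ).

% Stand-in for the real items procedure (an assumption for the example).
items( [ x | L ], L ).

% Direct translation of  list --> [ '(' ], items, [ ')' ].
list_direct( K, N ) :-
    terminal( '(', K, L ), items( L, M ), terminal( ')', M, N ).

% Improved translation: both calls on terminal executed in advance.
list_improved( [ '(' | L ], N ) :- items( L, [ ')' | N ] ).

% ?- list_direct( [ '(', x, ')' ], R ).      R = []
% ?- list_improved( [ '(', x, ')' ], R ).    R = []
```

The improved clause does the work of the two calls on terminal by unification in the head and in the call on items.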

We shall now present the grammar without parameters (it is, really, 
equivalent to a BNF definition). 

    grammar_rule --> lefthand_side, [ '-->' ],
                     righthand_side, [ '.' ].
    lefthand_side --> nonterminal, context.
    context --> terminals ; [].
    righthand_side --> alternatives.
    alternatives --> alternative ;
                     alternative, [ ';' ], alternatives.
    alternative --> [ [] ] ; rule_items.
    rule_items --> rule_item ; rule_item, [ ',' ], rule_items.
    rule_item --> nonterminal ; terminals ; condition ; [ ! ] ;
                  [ '(' ], alternatives, [ ')' ].
    nonterminal --> name ;
                    name, [ '(' ], list_of_terms, [ ')' ].
    terminals --> [ '[' ], list_of_terms, [ ']' ] ; string.
    condition --> [ '{' ], procedure_body, [ '}' ].
    list_of_terms --> term ; term, [ ',' ], list_of_terms.

Definitions of name, term, string and procedure_body are left as an
exercise.

It should be noted that the original appearance of grammar rules in the
Marseilles interpreter of Prolog I (Roussel 1975) was slightly different. In
particular, no alternatives were allowed, and terminal symbols and conditions
could not be combined. Just to give the flavour of it, we shall rewrite
in Marseilles syntax some of the grammar rules for statements (Section
3.4.2).

    :STATEMENTS( *STS ) == :STATEMENT( *ST )
                           :OTHERST( *ST, *STS ).
    :OTHERST( *ST1, SEQ( *ST1, *STS ) ) ==
        #; :STATEMENT( *ST2 ) :OTHERST( *ST2, *STS ).
    :OTHERST( *ST, *ST ) #*DELIM ==
        #*DELIM -STSDELIM( *DELIM ).
    :OTHERST( *DUMMY1, *DUMMY2 ) ==
        #*T :ERRONEOUS( *T ).
    :OTHERST( *ST, *ST ) == .
        *THIS FOR THE LAST STATEMENT.


3.5. PROGRAMMING HINTS 
3.5.1. Efficiency Considerations 

Metamorphosis grammars correspond to Prolog programs which implement
a very general parsing strategy: nondeterministic top-down parsing
with backtracking (Aho and Ullman 1977; Gries 1971). The potential
cost of this strategy is exponential. This is the disadvantage of the
generality and ease of programming with metamorphosis grammars.
Well-known parsing algorithms for restricted classes of context-free grammars
can be quite conveniently programmed in Prolog without metamorphosis
grammars. See for example the operator precedence parser described in
Section 7.4.3 and Appendix A.3. However, this requires explicit handling
of the parsing stack, attributes etc., while metamorphosis grammars by
themselves are as powerful as attribute grammars (Knuth 1968) or
two-level grammars (van Wijngaarden 1976); see the discussion in (Pereira
and Warren 1980). Parameters and conditions/actions make it possible to
construct an intuitively appealing, concise and readable metamorphosis
grammar of any existing programming language (and of reasonable subsets
of natural languages), capturing semantics as well as syntax; see e.g.
(Moss 1979). At the same time, such a grammar can usually be used as a
translator of this language, without additional effort on the part of the
programmer, but there is often a certain price to be paid in efficiency.

One source of inefficiency is repetition. Consider two rules from the 
grammar for statements (Section 3.3): 

    relation( eq( E1, E2 ) ) -->
        expression( E1 ), [ '=' ], expression( E2 ).
    relation( lt( E1, E2 ) ) -->
        expression( E1 ), [ '<' ], expression( E2 ).

If a given relation is not an equality, we recognize this state of affairs only 
after parsing the first expression and failing to find an equals sign. We 
abandon the rule and choose the next but then we must once more parse 
the first expression (which may be quite large). The problem remains if we 
change the order of the rules. 

To avoid this inefficiency, we may apply factorization—the technique 
already used in Section 3.4.2: 

    relation( R ) --> expression( E1 ), op_and_expr( E1, R ).
    op_and_expr( E1, eq( E1, E2 ) ) --> [ '=' ], expression( E2 ).
    op_and_expr( E1, lt( E1, E2 ) ) --> [ '<' ], expression( E2 ).

Another solution is to combine the original rules into a single rule by 
replacing the terminal symbols with a variable terminal, and adding a 
suitable condition: 

    relation( R ) --> expression( E1 ), [ Op ],
                      { makestruct( Op, E1, E2, R ) },
                      expression( E2 ).
    makestruct( '=', E1, E2, eq( E1, E2 ) ).
    makestruct( '<', E1, E2, lt( E1, E2 ) ).
Notice the position of the condition: if we placed it at the end of the rule,
we would run the risk of discovering an improper instance of Op only
after parsing the whole input, say,

    ( A + b/2 )*c blah_blah 2*( n - ( x + y )/4 )

In its present position the condition fails as soon as it sees an invalid
operator.

Both improvements of the original grammar eliminate possible repetitions.
Both, though, seem to decrease the readability and elegance of the
original solution, and we recommend that they be applied (if at all
necessary) only in the late stages of program debugging.
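
Both remedies can be compared side by side in a small runnable sketch. It assumes a modern Prolog with DCG support; expression is reduced to a single integer token, and the second version is named relation2 only so that the two can coexist in one program. Neither simplification comes from the text.

```prolog
% Sketch only: expression//1 is a one-token stand-in for the real grammar.
expression( E ) --> [ E ], { integer( E ) }.

% Factorized version: the first expression is parsed only once.
relation( R ) --> expression( E1 ), op_and_expr( E1, R ).
op_and_expr( E1, eq( E1, E2 ) ) --> [ '=' ], expression( E2 ).
op_and_expr( E1, lt( E1, E2 ) ) --> [ '<' ], expression( E2 ).

% Variable-terminal version: the condition rejects a bad operator at once.
relation2( R ) --> expression( E1 ), [ Op ],
                   { makestruct( Op, E1, E2, R ) },
                   expression( E2 ).
makestruct( '=', E1, E2, eq( E1, E2 ) ).
makestruct( '<', E1, E2, lt( E1, E2 ) ).

% ?- phrase( relation( R ), [ 1, '<', 2 ] ).      R = lt( 1, 2 )
% ?- phrase( relation2( R ), [ 1, blah, 2 ] ).    fails at the operator
```

In relation2 the structure R is built with E2 still uninstantiated; the variable is filled in when the second expression has been parsed.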


3.5.2. Elimination of Left Recursion 

We shall now discuss a problem which frequently arises with inexpert 
use of metamorphosis grammars. As an example, we shall consider the 
task of writing a workable grammar of simple arithmetic expressions (see 
Section 3.3). Here is the definition in BNF (for simplicity, we limit
ourselves to two operators only):

< expression > ::= < add_expr > | 

< expression > + < add_expr > | 

< expression > - < add_expr > 

< add_expr > ::= < constant > 

We now give an obvious transcription of this definition into a
metamorphosis grammar. Parameters are used to build the structure of a given
expression (see (3.9)).

    expression( E ) --> add_expr( E ).
    expression( E1 + E2 ) -->
        expression( E1 ), [ '+' ], add_expr( E2 ).          (3.10)
    expression( E1 - E2 ) -->
        expression( E1 ), [ '-' ], add_expr( E2 ).

The definition of add_expr will be left out (it can be simply an integer
constant).

Unfortunately, this grammar, as a program, is not only inefficient
but also incorrect. It goes into infinite (left) recursion whenever we give it
an expression that contains a minus. Try to analyse the expression 2 - 3
+ 5 (represented by 2.'-'.3.'+'.5.[]).











At first sight, it seems we can improve the situation by applying one of 
the techniques shown in the previous section. For example, the second 
technique gives the following rules: 

    expression( E ) --> add_expr( E ).
    expression( E ) --> expression( E1 ), [ Op ],
                        { makesum( Op, E1, E2, E ) },
                        add_expr( E2 ).
    makesum( '+', E1, E2, E1 + E2 ).
    makesum( '-', E1, E2, E1 - E2 ).

Now correct expressions will be parsed successfully, although an expression
composed of n add-expressions will require n - 1 backtracks before
reaching the solution. But the grammar will still fall into infinite recursion
on any incorrect input (you may wish to check this on 2.'+'.[]). This means
that it is of no practical value. As in all top-down parsing methods, we must
eliminate left recursion to avoid trouble.

Suppose we reverse nonterminal symbols in the recursive rules in 
(3.10): 

    expression( E1 + E2 ) --> add_expr( E1 ), [ '+' ], expression( E2 ).
    expression( E1 - E2 ) --> add_expr( E1 ), [ '-' ], expression( E2 ).

Now incorrect input causes the grammar to fail (without any error message,
but this can be fixed). However, this grammar interprets operators
as right-associative. The instantiation of its parameter for the expression
2 - 3 + 5 will be -(2, +(3, 5)) rather than +(-(2, 3), 5). Here is a possible
solution to this new problem:

    expression( E ) --> add_expr( E1 ), rest_of_expression( E1, E ).
    rest_of_expression( E1, E ) -->
        [ '+' ], add_expr( E2 ), rest_of_expression( E1 + E2, E ).
    rest_of_expression( E1, E ) -->
        [ '-' ], add_expr( E2 ), rest_of_expression( E1 - E2, E ).
    rest_of_expression( E1, E1 ) --> [].

When we parse an expression, the parameter E is initially uninstantiated.
It is passed unchanged and instantiated after reaching the end of
the expression. (In the terminology of attribute grammars this is a
synthesized attribute.) The final structure is accumulated step by step. For
example, during the parsing of the expression 2 - 3 + 4 - 5,
rest_of_expression will be activated four times, with 2, 2 - 3, (2 - 3) + 4
and ((2 - 3) + 4) - 5 as the first parameter. (This parameter is an inherited
attribute.) Eventually the third rule will be chosen and E instantiated to
((2 - 3) + 4) - 5.
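
The accumulation of the structure can be observed with the following sketch, written for a modern Prolog with DCG support. As the text suggests, add_expr is reduced to an integer constant; that one-line definition is the only assumption added here.

```prolog
expression( E ) --> add_expr( E1 ), rest_of_expression( E1, E ).

rest_of_expression( E1, E ) -->
    [ '+' ], add_expr( E2 ), rest_of_expression( E1 + E2, E ).
rest_of_expression( E1, E ) -->
    [ '-' ], add_expr( E2 ), rest_of_expression( E1 - E2, E ).
rest_of_expression( E, E ) --> [].

add_expr( N ) --> [ N ], { integer( N ) }.   % stand-in: constants only

% ?- phrase( expression( E ), [ 2, '-', 3, '+', 4, '-', 5 ] ).
%    E = 2 - 3 + 4 - 5        % i.e. ((2 - 3) + 4) - 5
% ?- phrase( expression( E ), [ 2, '+' ] ).
%    fails (and terminates)
```

Since - and + are left-associative operators in Prolog, the answer is printed without parentheses; evaluating it with is/2 gives -2, as expected of the left-associative reading.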
We shall now present a grammar for expressions, complete with error
handling, that fits the grammar for statements (see Sections 3.3 and 3.4.2).
The definition of erroneous was given in Section 3.4.2.

    expression( E ) --> add_expr( E1 ), rest_of_expression( E1, E ).
    rest_of_expression( E1, E ) -->
        [ '+' ], add_expr( E2 ), rest_of_expression( E1 + E2, E ).
    rest_of_expression( E1, E ) -->
        [ '-' ], add_expr( E2 ), rest_of_expression( E1 - E2, E ).
    rest_of_expression( E1, E1 ), [ Termin ] -->
        [ Termin ], { expr_termin( Termin ) }.
    rest_of_expression( _, _ ) --> [ T ], erroneous( T ).
    rest_of_expression( E1, E1 ) --> [].
    expr_termin( then ). expr_termin( else ).
    expr_termin( do ).   expr_termin( od ).
    expr_termin( ';' ).  expr_termin( fi ).

    add_expr( E ) --> mult_expr( E1 ), rest_of_add_expr( E1, E ).
    rest_of_add_expr( E1, E ) -->
        [ '*' ], mult_expr( E2 ), rest_of_add_expr( E1*E2, E ).
    rest_of_add_expr( E1, E ) -->
        [ '/' ], mult_expr( E2 ), rest_of_add_expr( E1/E2, E ).
    rest_of_add_expr( E1, E1 ), [ Termin ] -->
        [ Termin ], { add_expr_termin( Termin ) }.
    rest_of_add_expr( _, _ ) --> [ T ], erroneous( T ).
    rest_of_add_expr( E1, E1 ) --> [].
    add_expr_termin( Termin ) :- expr_termin( Termin ).
    add_expr_termin( '+' ).
    add_expr_termin( '-' ).

    mult_expr( E ) --> variable( E ).
    mult_expr( E ) --> constant( E ).
    mult_expr( E ) --> [ '(' ], expression( E ), [ ')' ].


To make the grammar really complete, we should also define variables
and constants. We choose not to do it, because variables require
symbol table handling, which we shall discuss in Section 4.2.2.

The techniques described above are only necessary if we want to
perform analysis with a metamorphosis grammar. Even more: the transformed
grammar is not good for synthesis, i.e. for constructing the sequence
of terminal symbols given a (correct!) structure. Specifically, for
synthesizing expressions, the only reasonable solution would be the original
grammar (3.10).
4.1. INTRODUCTION


Programming in Prolog differs from programming in classical
(Pascal-style) languages primarily at the level of individual procedures. The
larger the program, the more suitable the general recommendations of
programming methodology. The advantages of systematic top-down design of
programs, modularity¹, clean interfaces, etc., are certainly independent of
the programming language used. Design and coding techniques specific to
Prolog are due to its logical origin.

In Section 1.3.4 and Chapter 2 we discussed logical (static)
interpretation of procedures. This interpretation makes it possible to design
programs without paying attention, at least initially, to how the
computation will proceed. One only needs to indicate what will be computed.
Kowalski (1974, 1979a) coined an “equation”

    Algorithm = Logic + Control

which helps clarify the distinctive feature of logic programming. It is
maintained that logic programming relieves the programmer of the burden
of specifying control information for her program. One would like to say:
completely relieves, but unfortunately (at least in Prolog) this is not the
case. Many useful built-in procedures, such as the cut, input/output and
program modification procedures (assert, etc.; see Section 5.11), cannot
be interpreted statically. As a result, a practical program may not usually
be designed without paying regard to control information.

1 At least on a conceptual level: most existing Prolog implementations do not support it
explicitly.
In Section 4.3 we shall briefly consider the advantages and disadvantages
of some side-effects in Prolog; we shall also present several simple
tricks that help increase the efficiency of Prolog programs (especially their
space requirements) in many existing implementations. Earlier, in Section
4.2, we shall give a few examples of Prolog implementation of commonly
used data structures, in particular binary trees and linear lists. We shall
show basic operations on those structures and a few typical applications.
Section 4.4 contains small examples of program design.


4.2. EXAMPLES OF DATA STRUCTURES 


We have chosen unbalanced binary search trees (BSTs) and one-way 
linear lists as an illustration of methods of implementing recursive data 
structures in Prolog. We assume you are familiar with basic definitions 
and algorithms; a detailed, though rather elementary presentation can be 
found, for example, in Wirth (1976) or Sedgewick (1983). Here, we shall 
refer only to common intuitions, and we shall concentrate on problems 
specific to Prolog. 

We shall also briefly discuss representation of data structures by 
clauses—in particular, Prolog counterparts of arrays. 

4.2.1. Simple Trees and Lists 

Terms can usually be regarded as trees: the main functor labels the
root, subtrees correspond to arguments. This is slightly imprecise, because
multiple occurrences of variables represent more general structures:
directed acyclic graphs (DAGs). However, the term f(A, A)
which should be depicted as

    [figure: f with both argument arcs leading to a single node A]

can be thought of as

    [figure: f with two separate subtrees, each labelled A]

We must only remember that the two subtrees will remain identical, so
instantiating variables in one will affect the other. Another difficulty is
that it is possible to compute terms which are not even DAGs, and which
should therefore be regarded as corresponding to infinite trees (see Section
1.2.3). All the same, an ordinary tree is a good intuition of the (general)
term.

Terms are a convenient and concise representation of trees with irregular
structure, where the information in the nodes determines both the
shape of the tree and the repertory of applicable operations. The abstract
syntax tree of Fig. 3.3, Section 3.3, is a typical example. However, programs
that manipulate such irregular structures are usually problem-dependent,
in that every principal functor (i.e. every type of node) may
require different computations.

There are other situations, typified by binary search trees, when we
need a more uniform representation, because we use trees for contents
rather than for structure. Suppose we represent the BST of Fig. 4.1 as the
term

    few( people( many( languages ), speak ) )

Even if we disregard the ambiguity (is “languages” the left or right
descendant of “many”?), main functors and their arguments must be isolated,
that is, we must use the built-in procedure =.. (“univ”; see Section
5.10). To modify the tree, e.g. by adding a node, we must rebuild it
completely, also using univ. This is not only inelegant, but inefficient as well
(but see Section 4.2.6 for a discussion of such techniques).

We shall therefore represent empty binary trees by the atom

    nil

and nonempty trees by three-argument terms

    t( Left_subtree, Node_info, Right_subtree )

For example, the BST of Fig. 4.1 will be represented by the term

    t( nil, few, t( t( t( nil, languages, nil ), many, nil ),
                    people, t( nil, speak, nil ) ) )

The term can be drawn as a tree (see Fig. 4.2). This method of representing
binary trees can be readily adapted to trees of a different fixed degree,
e.g. non-empty ternary trees can be represented by four-argument terms

    tt( Node_info, Left_subt, Middle_subt, Right_subt )

In Fig. 4.2 the contents of each node is only a key, but of course in
practical applications nodes contain other information as well. The tree
shown in Fig. 4.3 holds names and phone numbers of several persons;
names are keys in lexicographic order. We use a nonassociative infix
functor to separate keys from other data.

FIG. 4.3 Another BST.

An inorder traversal of a BST visits the nodes in increasing order,
according to the ordering relation in the set of keys. For example, the
following procedure can be used to write out name-phone pairs, sorted
alphabetically by names:

    write_sorted( nil ).
    write_sorted( t( Left_subtree, Node_info, Right_subtree ) ) :-
        write_sorted( Left_subtree ),
        write( Node_info ), nl,
        write_sorted( Right_subtree ).

In this procedure, we need not test the actual ordering of nodes; this
would not be the case if we wanted, say, to locate a node in a tree. Let the
call

precedes( Node1, Node2 )

succeed iff Node1 comes before Node2. For our name-number pairs the
procedure can be defined simply as

precedes( Name1 : _, Name2 : _ ) :- Name1 @< Name2.

It is reasonable to expect that nodes are correctly built, e.g. that each
key is a name, and other information a number. A good place to check this
would be a procedure for inserting a node into a tree:

insert( Node, Tree, Newtree ) :-
    correct( Node ), !, ins( Node, Tree, Newtree ).
insert( Node, _, _ ) :- signal_error( Node ).

However, such defensive programming is seldom necessary in practice. 

The insertion procedure ins is rather straightforward. We must only 
take care to preserve the ordering relation: 


% an empty tree will be replaced by a new leaf
ins( Node, nil, t( nil, Node, nil ) ).
ins( Node, t( Left, Root, Right ), t( Newleft, Root, Right ) ) :-
    precedes( Node, Root ), ins( Node, Left, Newleft ).
ins( Node, t( Left, Root, Right ), t( Left, Root, Newright ) ) :-
    precedes( Root, Node ), ins( Node, Right, Newright ).


The procedure fails when it tries to duplicate a key (both calls on precedes 
fail). If the keys need not be unique, we must relax one of the tests, e.g. 
by changing 

precedes( Root, Node )

into 

not precedes(Node, Root) 
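
Building a tree by successive insertions can be tried out with the following
self-contained sketch. It repeats precedes and ins as given above; the helpers
build_bst and keys_inorder are hypothetical names introduced only for this
illustration, and the sample entries are those used for the name-phone tree.

```prolog
% Keys and data are joined by the infix functor ':' as in the text.
precedes( Name1 : _, Name2 : _ ) :- Name1 @< Name2.

% an empty tree will be replaced by a new leaf
ins( Node, nil, t( nil, Node, nil ) ).
ins( Node, t( Left, Root, Right ), t( Newleft, Root, Right ) ) :-
    precedes( Node, Root ), ins( Node, Left, Newleft ).
ins( Node, t( Left, Root, Right ), t( Left, Root, Newright ) ) :-
    precedes( Root, Node ), ins( Node, Right, Newright ).

% build_bst( Nodes, Tree0, Tree ): insert all Nodes into Tree0 (hypothetical helper).
build_bst( [], Tree, Tree ).
build_bst( [ Node | Nodes ], Tree0, Tree ) :-
    ins( Node, Tree0, Tree1 ),
    build_bst( Nodes, Tree1, Tree ).

% keys_inorder( Tree, Acc, Keys ): Keys are the keys of Tree in inorder,
% followed by Acc (hypothetical helper; an accumulator avoids append).
keys_inorder( nil, Keys, Keys ).
keys_inorder( t( Left, Key : _, Right ), Acc, Keys ) :-
    keys_inorder( Right, Acc, RightKeys ),
    keys_inorder( Left, [ Key | RightKeys ], Keys ).
```

A call such as build_bst( [ thompson : 2432, adams : 5488 ], nil, T ) builds the
tree; a further attempt to insert adams : _ into T fails, because both calls
on precedes fail for equal keys.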

A BST can be built by successive insertions. We shall not discuss
balanced trees. They present problems of their own, which can be solved
in much the same way as in classical programming languages (see e.g.
Sedgewick 1983) but which can cause memory problems with some Prolog
implementations. One example is an AVL-tree insertion program (van
Emden 1981, Vasey 1982).

We need some thought to delete a node even from an unbalanced tree. 
If either of the subtrees of the deleted node is empty, the other subtree 
moves up and replaces the node. For example, deleting adams : _ from 
the tree in Fig. 4.3 gives the tree in Fig. 4.4. Suppose now that both 
subtrees are nonempty; we shall preserve the ordering if we replace the 
deleted node by that with the largest key in the left subtree (or else that 
with the smallest key in the right subtree). For example, deleting 
thompson : _ in Fig. 4.3 gives the tree in Fig. 4.5.

The following procedures implement this algorithm. The second 
clause is for symmetry (and for efficiency) but it is not really necessary. 

del( Node, t( nil, Node, Right ), Right ).
del( Node, t( Left, Node, nil ), Left ).
del( Node, t( Left, Node, Right ), t( Newleft, Leftmax, Right ) ) :-
    remove_max( Left, Leftmax, Newleft ).
del( Node, t( Left, Root, Right ), t( Newleft, Root, Right ) ) :-
    precedes( Node, Root ), del( Node, Left, Newleft ).
del( Node, t( Left, Root, Right ), t( Left, Root, Newright ) ) :-
    precedes( Root, Node ), del( Node, Right, Newright ).

% find and remove the node with the largest key
remove_max( t( Left, Max, nil ), Max, Left ).
remove_max( t( Left, Root, Right ), Max, t( Left, Root, Newright ) ) :-
    remove_max( Right, Max, Newright ).

Normally we would call the procedure del with only the key given. 
We might encapsulate such calls: 

delete( Key, Oldtree, Newtree ) :-
    del( Key : _, Oldtree, Newtree ).

The last basic operation on BSTs is the search itself:

search( Node, t( _, Node, _ ) ).
search( Node, t( Left, Root, _ ) ) :-
    precedes( Node, Root ), search( Node, Left ).
search( Node, t( _, Root, Right ) ) :-
    precedes( Root, Node ), search( Node, Right ).

Again, we can encapsulate typical calls—“find information associated 
with a given key”: 

find( Tree, Key, Data ) :- search( Key : Data, Tree ).
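
Deletion and search can be exercised together on the name-phone tree. The
sketch below repeats the procedures of this section; bst_delete is a
hypothetical name standing in for the wrapper delete (chosen so that it cannot
clash with a library procedure of the same name), and phone_tree is a
hypothetical fact holding a sample tree.

```prolog
precedes( Name1 : _, Name2 : _ ) :- Name1 @< Name2.

del( Node, t( nil, Node, Right ), Right ).
del( Node, t( Left, Node, nil ), Left ).
del( Node, t( Left, Node, Right ), t( Newleft, Leftmax, Right ) ) :-
    remove_max( Left, Leftmax, Newleft ).
del( Node, t( Left, Root, Right ), t( Newleft, Root, Right ) ) :-
    precedes( Node, Root ), del( Node, Left, Newleft ).
del( Node, t( Left, Root, Right ), t( Left, Root, Newright ) ) :-
    precedes( Root, Node ), del( Node, Right, Newright ).

% find and remove the node with the largest key
remove_max( t( Left, Max, nil ), Max, Left ).
remove_max( t( Left, Root, Right ), Max, t( Left, Root, Newright ) ) :-
    remove_max( Right, Max, Newright ).

search( Node, t( _, Node, _ ) ).
search( Node, t( Left, Root, _ ) ) :-
    precedes( Node, Root ), search( Node, Left ).
search( Node, t( _, Root, Right ) ) :-
    precedes( Root, Node ), search( Node, Right ).

% wrappers that take only the key
bst_delete( Key, Oldtree, Newtree ) :- del( Key : _, Oldtree, Newtree ).
find( Tree, Key, Data ) :- search( Key : Data, Tree ).

% a sample tree (hypothetical fact, used only for the demonstration)
phone_tree( t( t( nil, adams : 5488, t( nil, mcbride : 1781, nil ) ),
               thompson : 2432,
               t( nil, white : 2432, nil ) ) ).
```

Deleting thompson from this tree moves mcbride, the largest key of the left
subtree, to the root; deleting adams simply moves its right subtree up.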

A slightly different method of representing binary trees consists in 
using 

l( Node )

for leaves, instead of t(nil, Node, nil). However, with this representation
we would have to distinguish empty trees from leaves of non-empty
trees. For example, two more clauses would be necessary in the
procedure for tree insertion.

As a very special case, we can consider trees of degree 1, that is, lists.
Recall that a widespread convention (introduced in Chapter 1) is to denote
empty lists by the atom

[]

and non-empty lists by infix terms

Head.Tail

The period is used to build trees of degree 2, which are a convenient
representation of lists. It plays the same role as t in our BST example. In
Prolog-10 a special notation has been invented as yet another application
of syntactic sugar. It is very commonly used, even though its advantages
over dot notation are debatable. Instead of Head.Tail we shall write 2

[ Head | Tail ]

the list a.b.c.Tail will be written as

[ a, b, c | Tail ]

and the list a.b.c.[] as

[ a, b, c ]

To make sure you have mastered this notation, check that [c | [d]] is the
same as [c, d].

We shall remind you of two list-manipulating procedures from Chapter 1. 
Membership: 

member( Item, [ Item | Tail ] ).
member( Item, [ _ | Tail ] ) :- member( Item, Tail ).

And list concatenation:

append( [], Second, Second ).
append( [ Head | First_tail ], Second, [ Head | Third_tail ] ) :-
    append( First_tail, Second, Third_tail ).

Here is another small example of operations on lists. Consider the
following simple-minded sorting algorithm: given a list, put all its
members in a BST and then apply the procedure write_sorted, defined above.

sort( List ) :- buildtree( List, nil, Tree ), write_sorted( Tree ).

% 2nd and 3rd arguments: the tree built so far, the final tree
buildtree( [], Finaltree, Finaltree ).
buildtree( [ Item | Items ], Currenttree, Finaltree ) :-
    insert( Item, Currenttree, Nexttree ),
    buildtree( Items, Nexttree, Finaltree ).

2 Sometimes an equivalent notation is used: [Head,..Tail], with ,.. written without
blanks.

In Section 4.2.3 we shall define a more useful sorting procedure based
on BSTs. It will construct the sorted permutation of a given list.

Just as in other programming languages, lists are used in Prolog
primarily to represent sequences and sets. They can also be used in a
standard way to represent trees of unspecified degree. For example, the tree of
Fig. 4.6 might be represented by the list

[a,[b,[e]],[c],[d,[f],[g]]]

Lists are best utilized when items are processed sequentially from left 
to right, or when all processing takes place at the beginning of the list. In 
the latter case the list is used as a stack. The basic stack operations, push 
and pop, can be easily written in one procedure, e.g. 

stack_op( Top, Rest_of_stack, [ Top | Rest_of_stack ]). 
with the call 

stack_op( Newtop, Stack, Newstack ) 
serving as push, and the call 

stack_op( Top, Newstack, Stack ) 

to execute pop. However, in practice we would rather operate on the stack
implicitly, by using appropriate terms in clause heads. One example is the
procedure reduce (see Section 7.4.3) with old and new stacks as
parameters. The clause

reduce( [ br( r, ')' ), t( X ), br( l, '(' ), id( I ) | S ],
        [ t( tr( I, X ) ) | S ] ).

describes an action that consists of four pops followed by one push.
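
Implicit stack handling in clause heads can be illustrated by a small
evaluator of postfix expressions. This is a sketch of our own, unrelated to
the procedure reduce; the names eval_postfix, eval and apply_op are
hypothetical. An operator clause pops two operands and pushes the result
simply by the shape of its stack arguments.

```prolog
% eval_postfix( Tokens, Value ): Tokens is a list of numbers and the
% operator atoms +, - and * in postfix order.
eval_postfix( Tokens, Value ) :- eval( Tokens, [], Value ).

% eval( Tokens, Stack, Value )
eval( [], [ Value ], Value ).
eval( [ N | Tokens ], Stack, Value ) :-          % push a number
    number( N ),
    eval( Tokens, [ N | Stack ], Value ).
eval( [ Op | Tokens ], [ Y, X | Stack ], Value ) :-   % two pops, one push
    apply_op( Op, X, Y, Z ),
    eval( Tokens, [ Z | Stack ], Value ).

apply_op( (+), X, Y, Z ) :- Z is X + Y.
apply_op( (-), X, Y, Z ) :- Z is X - Y.
apply_op( (*), X, Y, Z ) :- Z is X * Y.
```

For instance, eval_postfix( [ 2, 3, 4, (*), (+) ], V ) computes 2 + 3 * 4.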

Nonsequential access to a list requires, as might be expected, time
proportional to the list's length. To build a list in linear time, we can
successively push incoming items, but the original sequence will be
reversed. Alternatively, we can use append to preserve the original order of
items, but this would square the running time. Moreover, each call on
append entails not only a traversal of the entire list, but also creation of its
copy. Strictly speaking, a series of variables is produced and instantiated
to successive tails. When executing the call

append( [ It1, It2 ], [ It3 ], X )

the following instantiations take place:

X <- [ It1 | Third_tail' ]
Third_tail' <- [ It2 | Third_tail" ]
Third_tail" <- [ It3 ]

As a result, only the top-level structure is copied. The situation is roughly 
as in Fig. 4.7: the two lists share all items but the last. 
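
The linear-time alternative mentioned above, pushing incoming items and
accepting the reversed order, might be sketched as follows. The names
push_all and reverse_by_pushing are hypothetical; each item is pushed once,
so no list is ever copied.

```prolog
% push_all( Items, Stack, Final ): push Items one by one onto Stack.
push_all( [], Stack, Stack ).
push_all( [ Item | Items ], Stack, Final ) :-
    push_all( Items, [ Item | Stack ], Final ).

% The items come out in reverse order of arrival.
reverse_by_pushing( List, Reversed ) :- push_all( List, [], Reversed ).
```

If the original order matters, the list can be reversed once more at the end,
which still takes only linear time.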

We had a similar situation in the tree insertion procedure. Check 
that Fig. 4.8 properly illustrates the picture after inserting turner : 6481 
into the tree of Fig. 4.3: we copy the top-level structure of the whole 
branch. 

Copying structures upon modification is necessary because of the
semantics of the operations: when we call append(L1, L2, L3) to
concatenate L1 and L2, we may wish to preserve an unmodified L1. If we want
destructive modification operations, we must express this desire
explicitly.



FIG. 4.7 The result of appending two lists.

FIG. 4.8 The result of insertion into a BST.


4.2.2. Open Lists and Trees

If we want to build lists efficiently, we must avoid copying longer and 
longer initial segments of the final list. Recall how append extends the list 
piece by piece. After the call 

append( [ It1, It2 ], [ It3 ], X )

we get

X <- [ It1 | Third_tail' ]
Third_tail' <- [ It2 | Third_tail" ]

and finally bind Third_tail". The trick is to keep Third_tail" ready for a
subsequent instantiation:

Third_tail" <- [ It3 | Third_tail'" ]

The situation will be roughly as in Fig. 4.9. Figuratively speaking, we
shall be able to resume append in the next step of computation. We only
need to get hold of the variable Third_tail'", instantiate it:

Third_tail'" <- [ It4 | Third_tail"" ]

and so on. When we are through, we can instantiate, say,

Third_tail"" <- []

and come up with the final instance of X,

[ It1, It2, It3, It4 ]

We shall illustrate this with a procedure that reads in a sequence of
letters (up to the first non-letter) and puts them in a list:

read_letters( [ L | Tail ] ) :-
    lastch( L ), letter( L ), !, rch, read_letters( Tail ).
read_letters( [] ).

(See Section 5.7.4 for the description of lastch and rch.) 

The last tail variable can be left uninstantiated. Although the resulting
structure will not be a proper list, it will be equally good as a
representation of sequences. We shall call such structures open lists, and to avoid
confusion we shall call proper lists, with [] at the end, closed lists. Empty
open lists will be uninstantiated variables.

We must exercise some care if we deal with open lists. Consider the 
procedure that extends a given list by instantiating its tail variable: 

extend( List, Ext ) :- var( List ), List = Ext.
extend( [ _ | Tail ], Ext ) :- extend( Tail, Ext ).

For example, after the call

extend( [ a, b | V ], [ c, d | W ] )

the first parameter becomes [a, b, c, d | W].

It is essential that the instantiation of the tail variable be delayed. 
Consider what would happen if we changed the first clause to (apparently 
equivalent) 

extend( Ext, Ext). 

The result of the call 

extend( [ X, Y, Z | End1 ], [ a, b | End2 ] )

(i.e. the first parameter's instantiation) would be [a, b, Z | End1] instead of the
expected [X, Y, Z, a, b | End2].

Procedure extend can reasonably be used only in strictly deterministic
fashion. Failure after a successful computation causes dummy elements to be
inserted after the first list. For example, the calls

extend( [ a | E1 ], [ b | E2 ] ), fail

instantiate E1 as [b | E2], [_, b | E2], [_, _, b | E2], etc. Therefore a more
reasonable version would be that with a cut at the end of the first clause.
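
A deterministic variant of extend might look as follows. It is only a sketch:
we place the cut right after the var test, which commits to the first clause
just as a cut at its end would, and close_list is a hypothetical helper that
turns an open list into a closed one.

```prolog
% extend( List, Ext ): instantiate the tail variable of the open list List
% to Ext. The cut prevents dummy elements from being added on backtracking.
extend( List, Ext ) :- var( List ), !, List = Ext.
extend( [ _ | Tail ], Ext ) :- extend( Tail, Ext ).

% close_list( List ): bind the tail variable of List to [] (hypothetical helper).
close_list( List ) :- extend( List, [] ).
```

After extend( L, [ c, d | W ] ) with L = [ a, b | V ], a call close_list( L )
leaves L instantiated as the closed list [a, b, c, d].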

The reasoning that has led us to open lists can also be applied to trees.
Uninstantiated variables represent empty open trees. Non-empty open
trees will be represented as before. For example, the following term
represents the tree of Fig. 4.1 (E1, ..., E6 are distinct variables):

t( E1, few, t( t( t( E2, languages, E3 ), many, E4 ),
               people, t( E5, speak, E6 ) ) )

Again, we shall refer to trees discussed before as closed trees. 

We need not copy anything to insert a node into an open tree. We can
go down the appropriate branch, locate a suitable empty tree, i.e. a
variable, and instantiate it to a new leaf:

ins( Node, Empty ) :-
    var( Empty ), Empty = t( E1, Node, E2 ).
ins( Node, t( Left, Root, _ ) ) :-
    precedes( Node, Root ), ins( Node, Left ).
ins( Node, t( _, Root, Right ) ) :-
    precedes( Root, Node ), ins( Node, Right ).


If we rewrite the first clause as 

ins( Node, t( El, Node, E2 )). 

a subtle change in the procedure's behaviour will ensue. The procedure
will insert nothing if this Node was already present in the tree.
Surprisingly it will also be identical 3 to the procedure search from the previous
section, and (as might be expected) will serve almost the same purpose.
The overall effect of this insertion/search procedure can be described as 
follows. It looks for a given Node and succeeds after finding it. However, 
if there is no matching node in the tree, the procedure inserts Node and 
then “finds” it as well. 
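
The insertion/search behaviour can be observed with a small sketch. We assume
here that nodes are pairs p( Symbol, Address ), as in the symbol table example
that follows, and that they are ordered by Symbol; this definition of
precedes is our assumption, not part of the text.

```prolog
% Entries are pairs p( Symbol, Address ), ordered by Symbol (assumed ordering).
precedes( p( Key1, _ ), p( Key2, _ ) ) :- Key1 @< Key2.

% ins doubles as search: it finds a matching node, or instantiates an
% empty subtree (a variable) to a new leaf holding Node.
ins( Node, t( _, Node, _ ) ).
ins( Node, t( Left, Root, _ ) ) :-
    precedes( Node, Root ), ins( Node, Left ).
ins( Node, t( _, Root, Right ) ) :-
    precedes( Root, Node ), ins( Node, Right ).
```

Starting with an uninstantiated variable T as the empty tree, the calls
ins( p( x, V ), T ), ins( p( y, W ), T ), ins( p( x, U ), T ) insert two entries
and then bind U and V together. Like extend, the procedure should be used
deterministically; a failure after a successful call would make it insert
dummy nodes.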

There are some strikingly elegant applications of this. A well-known 
example is maintenance of symbol tables for translators written in Prolog. 
If the translated language is not block structured, a symbol table usually 
cannot contain duplicate entries, and it normally only grows, so that 
keeping it in an open tree will require no copying at all. 

The example we are going to present is, of necessity, rather involved. 
Before we proceed, you might find it helpful to return to Sections 3.3 and 
3.4.1, where we described a simple Algol-like language and sketched a 
parser and a code generator. 

We intend to produce object code for a single-address target machine.
For simplicity, we assume the code will not contain external references
(we shall also not attempt any optimisations).

3 The only difference is strictly technical: in some Prolog implementations dummy
variables cannot be used to pass information, so we must insert a leaf with fresh named
variables.

The code generator's output should be a list of "symbolic" instructions,
i.e. terms described schematically as

Opcode( Address) 

Each Address is an uninstantiated variable. There should be a unique 
Address for every addressable symbol of the source program (variable, 
constant, label), and for every label created by the code generator. By 
way of explanation, we give a possible translation of the assignment 

x:=x + y*y + 2 

—most opcodes have obvious meaning. 


[ load( A1 ),
  store( A2 ),
  load( A3 ),
  mult( A3 ),
  add( A4 ),
  add( A2 ),
  store( A1 ),
  stop( _ ),
  label( A1 ), data( _ ),      % x
  label( A2 ), data( _ ),      % temporary
  label( A3 ), data( _ ),      % y
  label( A4 ), data( 2 )       % constant 2
]

We want the same Prolog variable for all occurrences of a source variable; 
for example, A1 always stands for x. 

To assemble this section of code, we should determine the base
address and go down the list, counting bytes (or other units of storage).
Each executable instruction would be assigned a final address. The
pseudoinstruction label would be treated differently. We would instantiate
Address as the current value of the location counter (without advancing
the counter); this would instantiate all occurrences of Address (or of
variables bound to it, if one wants such fine distinctions). Assuming each
instruction takes four bytes and the fragment of code starts at location
1000, we would obtain

1000 : load( 1032 ),
1004 : store( 1036 ),
1008 : load( 1040 ),
1012 : mult( 1040 ),
etc.

Conveniently enough, all we need to achieve this remarkable behaviour
is the procedure ins (it should have rather been christened
table_lookup). Whenever the translator encounters a symbol, say x, in
the source program, it allocates a fresh variable V, to represent the
symbol in subsequent processing. It also calls ins to locate or place the pair

p( x, V )

in the symbol table. On the first occurrence of x the pair will actually be
inserted. A subsequent "insertion" of p(x, U) only binds V and U
together, i.e. finds x's "symbolic address".
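
The assembly pass described earlier, in which each label pseudoinstruction
instantiates its Address to the current location counter, might be sketched
as follows. The procedure name assemble and its argument order are
hypothetical, and we assume that data items, like instructions, occupy four
bytes.

```prolog
% assemble( Code, Location, Assembled ): Assembled pairs every instruction
% of Code with its final address; label( Address ) binds Address to the
% current Location and does not advance the counter.
assemble( [], _, [] ).
assemble( [ label( Address ) | Code ], Address, Assembled ) :- !,
    assemble( Code, Address, Assembled ).
assemble( [ Instr | Code ], Location, [ Location : Instr | Assembled ] ) :-
    Next is Location + 4,            % assumed size of every item
    assemble( Code, Next, Assembled ).
```

Applied to the symbolic code for x := x + y*y + 2 with the base address 1000,
the pass binds A1 to 1032, and since the variables are shared, the first
element of the result becomes 1000 : load( 1032 ).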

For this scheme to work properly, each non-terminal symbol in the
grammar that implements our code generator (see Section 3.4.1) must be
furnished with one additional parameter to pass the symbol table 4. The
whole grammar should be called with an empty table:

generate_code( S, O ) :-
    phrase( code( S, SymTab ), O ).

And here is a rule that might be used to generate code for assignments:

code( assign( Name, Expr ), SymTab ) -->
    codeexpr( Expr, SymTab ),
        % code for this arithmetic expression,
        % the value will be left in the accumulator
    [ store( Addr ) ],
    { ins( p( Name, Addr ), SymTab ) }.

Symbol tables can also be implemented in open lists. For short tables 
lack of overhead due to key ordering tests can outweigh the loss due to 
worse performance. The simplest lookup procedure for open lists can be 
written as follows: 

lookup( Entry, [ Entry | Tail ] ). 

lookup( Entry, [ _ | Tail ] ) :- lookup( Entry, Tail ). 

This procedure, and two other versions (a bit more sophisticated) have
been used in the Prolog part of ToyProlog implementation (see Section
7.4, Appendices A.2 and A.3), and in the program described in
Section 8.2.
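
The behaviour of lookup on an open list can be checked with a short sketch.
The entries p( Symbol, Address ) follow the symbol table example above; the
table itself starts as an uninstantiated variable.

```prolog
% lookup( Entry, Table ): find Entry in the open list Table, or add it
% by instantiating the tail variable of Table.
lookup( Entry, [ Entry | _ ] ).
lookup( Entry, [ _ | Tail ] ) :- lookup( Entry, Tail ).
```

The calls lookup( p( x, A ), T ), lookup( p( y, B ), T ), lookup( p( x, C ), T )
leave T as an open list with two entries and bind A and C together. As with
extend, the procedure is meant to be used deterministically.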

Open lists were first used in the bootstrapped Prolog interpreter from 
Marseilles (Battani and Meloni 1973, Roussel 1975). The technique shown 
in the code generator example was presented by Colmerauer (1975, 1978). 
Open trees were introduced by Warren (1977b, 1980b). 


4 For simplicity, we omitted the symbol table while developing the parser. We can save
this particular program by doing symbol table management in the back-end, but of course the
more proper way is to install symbols in the table in the front-end.

4.2.3. Difference Lists 5

If the application does not require shortening a list, open lists can be
constructed with no copying whatsoever. Successive instances of the
originally empty list (a variable) are longer and longer open lists
(assuming, of course, that we are careful to instantiate final variables
appropriately). However, each time we add an item, the list must be traversed
to find the final variable. To avoid this, we can keep this variable ready for
instantiation:

End = [ NewItem | NewEnd ]

and make NewEnd available for further processing. 

The pair consisting of a list and its final variable can be considered
another representation of the list, a little redundant for the sake of
efficiency. It is reasonable to represent the pair as a single term. We shall
write it as

OpenList -- ItsFinalVariable

with -- a nonassociative infix functor. For example:

[ a, b | X ] -- X

To add an item at the end of a list we use the procedure

additem( Item, List -- [ Item | NewEnd ], List -- NewEnd ).

The call

additem( 4, [ 1, 2, 3 | X ] -- X, NewList )

instantiates, as expected,

NewList <- [ 1, 2, 3, 4 | NewEnd ] -- NewEnd

because

X <- [ 4 | NewEnd ]

Consequently, the old list becomes

[ 1, 2, 3, 4 | NewEnd ] -- [ 4 | NewEnd ]

To get a new list, we had to destroy the old one. 

Fortunately, the destruction is only apparent. The pair can still be
regarded as a representation of the sequence 1, 2, 3. Notice that
[4 | NewEnd] is a tail of [1, 2, 3, 4 | NewEnd]. The sequence consists of those
items we must pop off the first list to get its tail, i.e. of items by which the
two lists differ; hence the name of this data structure: difference list
(d-list for short).

Actually, a pair consisting of an open list and its tail is only a special
case: a difference list is defined as a pair X -- Y such that X = Y or X =
[A1, ..., An | Y] for some n >= 1. In general, no restrictions need be placed
on the form of Y, although the most interesting applications of difference
lists are those where Y is an open list.

5 Difference lists (d-lists) were introduced by Clark and Tärnlund (1977).

Difference lists can be used to advantage whenever activity is
expected at both ends of the sequence, e.g. when it is used as a queue. The
procedure additem enqueues an item. To dequeue an item, we can use the
obvious

remitem( Item, [ Item | List ] -- End, List -- End ).

but the behaviour of this procedure is unsatisfactory for empty lists. The
call

remitem( Item, E -- E, NewList )

instantiates NewList as List -- [Item | List], i.e. as a "negative difference
list" 6. A procedure which fails, given an empty list, may be written as
follows:

removeitem( Item, List -- End, NewList -- End ) :-
    not List = End, List = [ Item | NewList ].
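
A queue built on difference lists can be tried out with the sketch below. Two
points are our assumptions rather than part of the text: the operator
declaration for -- (the priority 500 is an arbitrary choice), and the
emptiness test, where we use the identity test \== in place of not List = End,
because in systems that unify without the occurs check the unification of
[ Item | End ] with End may succeed. The name empty_queue is hypothetical.

```prolog
:- op( 500, xfx, -- ).          % assumed declaration of the infix functor

% An empty queue is a pair of identical variables.
empty_queue( Q -- Q ).

% Enqueue at the end by instantiating the final variable.
additem( Item, List -- [ Item | NewEnd ], List -- NewEnd ).

% Dequeue from the front; fails for an empty queue.
removeitem( Item, List -- End, NewList -- End ) :-
    List \== End, List = [ Item | NewList ].
```

Items are removed in the order in which they were added, and removeitem fails
once the queue has been emptied.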

Another nice feature of difference lists is the way they can be
concatenated. Suppose we have two lists:

[ a, b | X ] -- X and [ c, d, e | Y ] -- Y,

and we want to compute a list holding the sequence a, b, c, d, e. If we can
assure that

X = [ c, d, e | Y ],

we shall have [a, b | X] = [a, b, c, d, e | Y], and

[ a, b | X ] -- Y

will be a solution. This is readily generalized as a procedure:

d_conc( List1 -- Tail1, Tail1 -- Tail2, List1 -- Tail2 ).

6 This structure can be very useful in its own right; see Shapiro (1983b, Section 4.5).

Once again, it must be stressed that modification of such lists is
destructive. For example, the second call below fails, because [c, d, e | Y] does
not match [p | Z].

d_conc( [ a, b | X ] -- X, [ c, d, e | Y ] -- Y, ABCDE ),
d_conc( [ a, b | X ] -- X, [ p | Z ] -- Z, ABP ).
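
Both the concatenation and its destructive character can be observed with a
short sketch. The operator declaration is our assumption (any priority that
makes -- a nonassociative infix functor would do).

```prolog
:- op( 500, xfx, -- ).          % assumed declaration of the infix functor

% Concatenation of difference lists in one unification step.
d_conc( List1 -- Tail1, Tail1 -- Tail2, List1 -- Tail2 ).
```

Calling d_conc( [ a, b | X ] -- X, [ c, d, e | Y ] -- Y, L -- [] ) closes the
result at once, so that L becomes [a, b, c, d, e]. A second attempt to
concatenate something else to the same first list fails, because X is
already instantiated.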

We now return to the sorting algorithm based on BSTs (see Section 
4.2.1). Instead of traversing the tree, built of a given list, and merely 
writing out the nodes, we would rather traverse it in order to construct the 
sorted permutation of the list: 

tree_sort( List, SortedList) :- 
buildtree( List, nil, Tree ), 
buildlist( Tree, SortedList). 

The procedure buildlist "flattens" the tree (see Fig. 4.10 for an
example). The general outline of the algorithm is rather obvious: we flatten the
subtrees (recursively) and concatenate the resulting lists together with the
root in between. Difference lists can be used to avoid numerous appends.
Let the results of recursive calls be denoted by

LFlat -- LFlatE and RFlat -- RFlatE

The algorithm is programmed as follows:

flatten( nil, X -- X ).
flatten( t( L, Root, R ), Flat ) :-
    flatten( L, LFlat -- LFlatE ),
    flatten( R, RFlat -- RFlatE ),
    d_conc( LFlat -- LFlatE, [ Root | X ] -- X, A ),
    d_conc( A, RFlat -- RFlatE, Flat ).



FIG. 4.10 (a) A tree. (b) The tree flattened.

This version is good for didactic purposes. Actually, we know that
the following instantiations take place:

LFlatE <- [ Root | X ], A <- LFlat -- X,
X <- RFlat, Flat <- LFlat -- RFlatE

We can remove both calls on d_conc and end up with an equivalent form
of the second clause:

flatten( t( L, Root, R ), LFlat -- RFlatE ) :-
    flatten( L, LFlat -- [ Root | RFlat ] ),
    flatten( R, RFlat -- RFlatE ).

We might similarly derive a "short cut" clause for leaves. We begin with

flatten( t( nil, Root, nil ), LFlat -- RFlatE ) :-
    flatten( nil, LFlat -- [ Root | RFlat ] ),
    flatten( nil, RFlat -- RFlatE ).

then make LFlat = [Root | RFlat] and RFlat = RFlatE, and remove the
recursive calls. The special case becomes:

flatten( t( nil, Root, nil ), [ Root | RFlatE ] -- RFlatE ).

(as expected!).

After the call

flatten( Tree, List -- ListEnd )

we shall have List instantiated as

[ Node1, ..., Noden | ListEnd ],

and all we shall need to get SortedList is close List by binding ListEnd to
[]. This is easily achieved by defining

buildlist( Tree, SortedList ) :-
    flatten( Tree, SortedList -- [] ).

(or replacing the buildlist call in tree_sort, for that matter).
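
The whole sorting procedure can be assembled into one self-contained sketch.
Several details are our assumptions: the operator declaration for --, a
version of precedes that compares whole items in the standard order of terms,
the relaxed test in the last clause of ins (so that repeated items are
accepted, as suggested in Section 4.2.1), and the name flatten_tree, used
instead of flatten to avoid a clash with a library procedure in some systems.

```prolog
:- op( 500, xfx, -- ).          % assumed declaration of the infix functor

precedes( X, Y ) :- X @< Y.     % assumed ordering: standard order of terms

ins( Node, nil, t( nil, Node, nil ) ).
ins( Node, t( Left, Root, Right ), t( Newleft, Root, Right ) ) :-
    precedes( Node, Root ), ins( Node, Left, Newleft ).
ins( Node, t( Left, Root, Right ), t( Left, Root, Newright ) ) :-
    \+ precedes( Node, Root ), ins( Node, Right, Newright ).

buildtree( [], Finaltree, Finaltree ).
buildtree( [ Item | Items ], Currenttree, Finaltree ) :-
    ins( Item, Currenttree, Nexttree ),
    buildtree( Items, Nexttree, Finaltree ).

% flatten_tree( Tree, List -- End ): inorder contents of Tree as a d-list.
flatten_tree( nil, X -- X ).
flatten_tree( t( L, Root, R ), LFlat -- RFlatE ) :-
    flatten_tree( L, LFlat -- [ Root | RFlat ] ),
    flatten_tree( R, RFlat -- RFlatE ).

tree_sort( List, SortedList ) :-
    buildtree( List, nil, Tree ),
    flatten_tree( Tree, SortedList -- [] ).
```

No call on append or d_conc is needed: every node is placed in the result
list by a single unification.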

See Section 7.4.1 for a little more sophisticated application of
difference lists.


4.2.4. Clausal Representation of Data Structures

A Prolog procedure built of unit clauses is a natural representation of
sets and sequences. Under the static interpretation of programs, such a
procedure models a relation, i.e. a set of tuples for which a certain
relationship holds. For example:

name_phone( thompson, 2432 ).
name_phone( adams, 5488 ).
name_phone( white, 2432 ).
name_phone( mcbride, 1781 ).

In practice, unit clause procedures are sequences rather than sets, in 
that they are accessed sequentially. It is therefore possible to represent a 
list by a procedure, e.g. 

list( b ).
list( k ).
list( q ).
list( y ).

The call

list( X )

tests membership for instantiated X, and serves as a generator for
uninstantiated X. The whole list can be processed thus:

process_list :- list( X ), process_item( X ), fail.
process_list.

In general, clauses may be used to represent multidimensional matrices— 
we shall discuss this briefly in the next section. 

The restriction to unit clauses is not essential. The clause 

name_phone( X, 4396 ) :- office( X, room_119 ).

will generate tuples one at a time, exactly as the other four clauses do. 
It is worth emphasizing that explicit and generated data are functionally 
indistinguishable. If five people sit in room 119, we can get up to nine 
name-phone pairs, without ever becoming aware of the “indirection” in 
one of the clauses. 

Any structure expressible in terms of relations can be naturally cast in 
clauses. For example, a tree can be described as follows: 

t( node1, node2, thompson : 2432, node3 ).
t( node2, nil, adams : 5488, node4 ).
t( node3, nil, white : 2432, nil ).
t( node4, nil, mcbride : 1781, nil ).
root( node1 ).
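
A tree stored in clauses can still be traversed recursively, by following the
node names. The sketch below repeats the clauses above and adds a hypothetical
procedure inorder_keys, which collects the keys in inorder with the help of
an accumulator.

```prolog
t( node1, node2, thompson : 2432, node3 ).
t( node2, nil, adams : 5488, node4 ).
t( node3, nil, white : 2432, nil ).
t( node4, nil, mcbride : 1781, nil ).
root( node1 ).

% inorder_keys( NodeName, Acc, Keys ): Keys are the keys of the subtree
% named NodeName in inorder, followed by Acc (hypothetical helper).
inorder_keys( nil, Keys, Keys ).
inorder_keys( Name, Acc, Keys ) :-
    t( Name, Left, Key : _, Right ),
    inorder_keys( Right, Acc, RightKeys ),
    inorder_keys( Left, [ Key | RightKeys ], Keys ).
```

The traversal is started with root( R ), inorder_keys( R, [], Keys ).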
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In particular, we can represent a list in this way: 

l( item1, b, item4 ).
l( item2, y, nil ).
l( item3, q, item2 ).
l( item4, k, item3 ).
head( item1 ).

In general, every graph can be expressed as a unit-clause procedure. By 
way of explanation, here is a possible representation of the graph of Fig. 
4.11 (see also Fig. 3.1): 

edge( e1, e2, o ).
edge( e1, e2, letter ).
edge( e1, e3, atom ).
edge( e2, e3, x ).
edge( e2, e3, letter ).
edge( e2, e3, atom ).

And a representation of the graph of Fig. 4.15 (Section 4.4.3): 

arc( a, b ). arc( a, c ). arc( b, c ). arc( b, d ). 

arc( b, e ). arc( c, d ). arc( c, e ). arc( d, e ). 

Clausal representation of trees, lists and the like is rather less
convenient than representations described in previous sections. It cannot be
passed as an actual parameter, so that its use can only be recommended
when the bulk of data remains unchanged (see Section 8.1 for a non-trivial
example). Since variables are local in clauses, clever techniques shown in
Section 4.2.2 are hardly applicable here. To build and modify data
dynamically (e.g. add a node to a tree), we must apply "extralogical" built-in
procedures assert, retract, etc., to the detriment of static interpretation of
programs.

There are advantages, too. First of all, in Prolog implementations
which support clause indexing, direct access to components can be
possible. Indexing consists in finding matching clauses by hashing rather than
by linear search, so that e.g. a node in a "tree" with n nodes can be
located in constant time rather than in log n steps (on the average).
DEC-10 Prolog was the first to offer this possibility. If absent, it can be
mimicked by means of the built-in procedure =.. (see the next section).

FIG. 4.11 A graph.

Clausal representation sometimes helps reduce the problem at hand 
to its bare essence. A case in point is an amazingly concise solution to a 
map colouring problem; we quote it after Pereira and Porto (1980b). A 
planar map is to be coloured with at most four colours so that contiguous 
regions are coloured differently. First we define the contiguity relation for 
colours: 


next( red, blue ).      next( red, green ).     next( red, yellow ).
next( blue, red ).      next( blue, green ).    next( blue, yellow ).
next( green, red ).     next( green, blue ).    next( green, yellow ).
next( yellow, red ).    next( yellow, blue ).   next( yellow, green ).


The original map of Pereira and Porto (1980b) is shown in Fig. 4.12. A
region is represented by its colour; this decision makes the solution
beautifully terse. To find a colouring (if any) of the map, we must only call

:- next( R1, R2 ), next( R1, R3 ), next( R1, R5 ), next( R1, R6 ),
   next( R2, R3 ), next( R2, R4 ), next( R2, R5 ), next( R2, R6 ),
   next( R3, R4 ), next( R3, R6 ), next( R5, R6 ),
   write( [ R1, R2, R3, R4, R5, R6 ] ), nl.

FIG. 4.12 A map to be coloured.

Structures represented by terms are usually traversed and manipulated
by recursive procedures. Clauses are traversed by backtracking,
either implicit (e.g. in the call above), or explicit (e.g. in the procedure
process_list). There is a fundamental discrepancy between these two
modes, because backtracking destroys variable instantiations which are
essential to recursive operations on data structures. Consider the task of
computing a list of arcs exiting vertex b of the graph in Fig. 4.15. Arcs are
available one at a time to a routine that "backtracks through" the
procedure arc. If we want them to survive backtracking, we must "put aside",
i.e. assert, those which contain b:

put_aside :- arc( X, Y ), put_aside_if_b( X, Y ), fail.

put_aside_if_b( X, Y ) :- has( b, X, Y ), assert( with_b( X, Y ) ).

has( X, X, _ ).
has( X, _, X ).

:- put_aside.

% Now, a list can be created as follows:
collect_with_b( ThisList, FinalList ) :-
    retract( with_b( X, Y ) ),
    collect_with_b( [ (X, Y) | ThisList ], FinalList ).
collect_with_b( FinalList, FinalList ).

:- collect_with_b( [], TheList ), write( TheList ), nl.

Such operations are usually cast in terms of a general-purpose
procedure that finds a set of all items for which a given condition holds. In our
example, items would be (X, Y), and the condition

( arc( X, Y ), has( b, X, Y ) )

The set is represented by a list, possibly with repetitions, so that it is
called bag in the folklore. Here is our version of the procedure:

bagof( Item, Condition, _ ) :-
    assert( 'BAG'( 'BAG' ) ),        % a marker
    Condition,                       % generates an instance of Item
    assert( 'BAG'( Item ) ),         % saves it
    fail.                            % this clause eventually fails
bagof( _, _, Bag ) :-
    retract( 'BAG'( Item ) ), !,     % get the last Item saved
    collect( Item, [], Bag ).

collect( 'BAG', FinalBag, FinalBag ) :- !.     % this was the marker
collect( Item, ThisBag, FinalBag ) :-
    retract( 'BAG'( NextItem ) ), !,
    collect( NextItem, [ Item | ThisBag ], FinalBag ).
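
The procedure relies on assert placing new clauses in front of the old ones,
so that retract delivers the most recently saved item first. In systems where
assert adds clauses at the end and bagof is a built-in procedure, the same
idea might be sketched as follows; the name my_bagof, the use of asserta and
call, and the dynamic declaration are our adaptations, not part of the text.
The arcs are those of Fig. 4.15.

```prolog
:- dynamic 'BAG'/1.

arc( a, b ).  arc( a, c ).  arc( b, c ).  arc( b, d ).
arc( b, e ).  arc( c, d ).  arc( c, e ).  arc( d, e ).

has( X, X, _ ).
has( X, _, X ).

my_bagof( Item, Condition, _ ) :-
    asserta( 'BAG'( 'BAG' ) ),       % a marker
    call( Condition ),               % generates an instance of Item
    asserta( 'BAG'( Item ) ),        % saves it in front of earlier ones
    fail.                            % this clause eventually fails
my_bagof( _, _, Bag ) :-
    retract( 'BAG'( Item ) ), !,     % get the last Item saved
    collect( Item, [], Bag ).

collect( 'BAG', FinalBag, FinalBag ) :- !.     % this was the marker
collect( Item, ThisBag, FinalBag ) :-
    retract( 'BAG'( NextItem ) ), !,
    collect( NextItem, [ Item | ThisBag ], FinalBag ).
```

The call my_bagof( (X, Y), ( arc( X, Y ), has( b, X, Y ) ), Bag ) collects the
arcs that contain b, in the order in which they were generated. If no item
satisfies the condition, the bag is simply empty.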

The marker enables us to use the procedure bagof within Condition.

An example of such nested computation is the following pair of calls
(Graph is to be a list of "bunches", i.e. lists of arcs entering or leaving a
given vertex; the condition in the first call is an alternative, in the
embedded one a conjunction):

:- bagof( X, ( arc( X, _ ) ; arc( _, X ) ), Vertices ),
   bagof( Bunch,
          ( member( V, Vertices ),
            bagof( ( Y, Z ),
                   ( arc( Y, Z ), has( V, Y, Z ) ),
                   Bunch ) ),
          Graph ).

Repetitions in a bag may be undesired. For example, the first call 
above should rather find a set of all vertices—as it stands, Graph will 
contain numerous duplicates. The procedure setof would call bagof and 
then filter the resulting bag. In Prolog-10 and some other implementations 
both bagof and setof are built-in procedures: setof even returns its output
sorted. An implementation of setof in Prolog was presented by Pereira
and Porto (1981).
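The filtering step can be sketched as follows. This is only an illustration, not the Pereira and Porto implementation: we assume a system that offers the standard findall/3 to gather the bag, and the names set_of, drop_duplicates and remove_all, as well as the three arc facts, are our own.

```prolog
% Illustrative graph (hypothetical data).
arc( a, b ).
arc( b, c ).
arc( c, a ).

% set_of( Item, Condition, Set ): collect a bag, then drop repetitions.
set_of( Item, Condition, Set ) :-
    findall( Item, Condition, Bag ),
    drop_duplicates( Bag, Set ).

% Keep the first occurrence of each item.
drop_duplicates( [], [] ).
drop_duplicates( [ X | Xs ], [ X | Ys ] ) :-
    remove_all( X, Xs, Rest ),
    drop_duplicates( Rest, Ys ).

% remove_all( X, List, Rest ): Rest is List without items identical to X.
remove_all( _, [], [] ).
remove_all( X, [ Y | Ys ], Rest ) :-
    X == Y, !,
    remove_all( X, Ys, Rest ).
remove_all( X, [ Y | Ys ], [ Y | Rest ] ) :-
    remove_all( X, Ys, Rest ).
```

With these facts, the call set_of( X, ( arc( X, _ ) ; arc( _, X ) ), Vs ) gathers the bag [a, b, c, b, c, a] and reduces it to [a, b, c]. Unlike the built-in setof, this sketch does not sort its result.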


4.2.5. Array Analogues in Prolog 

There is no addressing mechanism in Prolog, no memory cells directly
available to the programmer; for most applications this is simply
unnecessary. Consequently, there are no arrays interpreted as contiguous,
addressable areas. From a mathematical standpoint, arrays correspond
to finite matrices, i.e. to mappings from finite sets of subscripts to
sets of values. In theory, there are no restrictions on the form of
subscripts, although integers are most commonly used.

In Prolog we can represent such mappings as procedures consisting of
unit clauses, one clause for each sequence of subscripts and the
corresponding value. This is but a special case of relation in the
relational model of data (see Section 8.2).

Unit clauses are particularly convenient as a representation of sparse
matrices, provided that clause indexing is supported by the Prolog
implementation.
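As a small illustration, a sparse matrix might be stored as shown below. The procedure names m and element, the sample values, and the convention that missing elements count as 0 are all assumptions made for this sketch; the if-then-else construct ( -> ; ) is assumed to be available, as it is in most current Prolog systems.

```prolog
% m( Row, Column, Value ): only the non-zero elements are stored.
m( 1, 1, 5 ).
m( 2, 3, 7 ).
m( 4, 2, 1 ).

% element( I, J, V ): V is the stored value, or 0 if nothing is stored.
element( I, J, V ) :-
    (   m( I, J, Stored )
    ->  V = Stored
    ;   V = 0
    ).
```

The call element( 2, 3, V ) yields V = 7, and element( 3, 3, V ) yields V = 0. How quickly the matching clause of m is found depends on the indexing offered by the implementation.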

Another possibility is to represent a mapping as a list of n-tuples 
(subscripts, value), and to use list manipulation procedures. As a special 
case, a sequence subscripted by consecutive integers may be represented 
as a list of values. This approach may work for short lists, but in general it 
is prohibitively inefficient. 

We shall now present an alternative way of storing integer-subscripted
sequences, which is rather unlikely to outperform Prolog data
bases (with indexing), but may be reasonable for sequences of moderate
size. The method makes use of digital search trees (see e.g. Sedgewick
1983).

Branching in digital search trees is based on the values of successive
digits of the key being looked for. Keys cannot be negative. The order of
every node is equal to the base of the digital system, e.g. to 10 if keys are
expressed in decimal. In Fig. 4.13 we show two trees, each containing 15
items numbered 0 through 14. Ai denotes the i-th item, the root is empty
(i.e. contains a dummy value), and branches are labelled with digits. To
find A13 in the binary tree, we take 1101, the binary code of 13, and go
down selecting branches labelled with 1, 1, 0 and 1. To find A13 in the
ternary tree, we use 111, the ternary code of 13.

Digital search trees are best implemented in Prolog by open trees. We
shall demonstrate it in the case of ternary trees (other cases are basically
identical, although impractical if nodes have more than a few branches).

FIG. 4.13 Digital search trees: (a) Binary digital search tree, (b) Ternary
digital search tree.

A non-empty tree will be represented as

t3( Value, Left, Middle, Right) 

and an empty tree as a variable. Explicit labels are unnecessary, as we 
may simply select Left or Middle, or Right upon encountering 0, 1 or 2, 
respectively. 

A procedure that finds a value, given a ternary subscript and a tree, is
quite straightforward. We assume that subscripts are represented as
closed lists of digits:

find_3( [], t3( Value, _, _, _ ), Value ).
find_3( [ 0 | Sub ], t3( _, Left, _, _ ), Value ) :-
    find_3( Sub, Left, Value ).
find_3( [ 1 | Sub ], t3( _, _, Middle, _ ), Value ) :-
    find_3( Sub, Middle, Value ).
find_3( [ 2 | Sub ], t3( _, _, _, Right ), Value ) :-
    find_3( Sub, Right, Value ).

The procedure fails if the first parameter is not a correct ternary subscript,
or if the second parameter is not an open ternary tree. However, it does 
not fail when a nonexistent item is referred to. We shall discuss this 
phenomenon presently. 

Now for a procedure that replaces an item. Two tree parameters are 
required, and the new tree is a copy of the old one, except for the replaced 
item. The amount of copying is similar to that illustrated in Fig. 4.8.

change_3( [], NewVal, t3( _, L, M, R ),
                      t3( NewVal, L, M, R ) ).
change_3( [ 0 | Sub ], NewVal, t3( OldVal, L, M, R ),
                               t3( OldVal, NewL, M, R ) ) :-
    change_3( Sub, NewVal, L, NewL ).
change_3( [ 1 | Sub ], NewVal, t3( OldVal, L, M, R ),
                               t3( OldVal, L, NewM, R ) ) :-
    change_3( Sub, NewVal, M, NewM ).
change_3( [ 2 | Sub ], NewVal, t3( OldVal, L, M, R ),
                               t3( OldVal, L, M, NewR ) ) :-
    change_3( Sub, NewVal, R, NewR ).

Both procedures behave in the same way when the subscript is too 
large: they create a missing part of the tree, and then “find” or “change” 
the newly inserted item. For example, the call 

find_3( [ 2, 1, 0, 1 ], Tree, A64 ) 

applied to the tree of Fig. 4.13b changes the node with item A7 into the 
tree of Fig. 4.14, or, in term notation, into 

t3( A7, t3( Dummy21, Empty_i,
            t3( A64, Empty_ii, Empty_iii, Empty_iv ),
            Empty_v ),
    Empty_vi, Empty_vii )

The same effect will be achieved by the call 

change_3( [ 2, 1, 0, 1 ], A64, Tree, Tree ) 

The moral is that, first, no special insertion procedure is needed, and, 
second, the tree need not be full. It will contain only the inserted nodes 
together with the branches required to reach these nodes, but intermediate
nodes may contain no meaningful information.
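The behaviour on an open tree can be tried with a short test. The definition of find_3 is repeated so that the sketch stands alone; the procedure demo and the item name a7 are ours and serve only as an illustration.

```prolog
find_3( [], t3( Value, _, _, _ ), Value ).
find_3( [ 0 | Sub ], t3( _, Left, _, _ ), Value ) :-
    find_3( Sub, Left, Value ).
find_3( [ 1 | Sub ], t3( _, _, Middle, _ ), Value ) :-
    find_3( Sub, Middle, Value ).
find_3( [ 2 | Sub ], t3( _, _, _, Right ), Value ) :-
    find_3( Sub, Right, Value ).

% demo( Tree, Found ): starting from an uninstantiated Tree, the first
% call creates the path 2, 1 and stores a7 there; the second call
% retrieves the stored item.
demo( Tree, Found ) :-
    find_3( [ 2, 1 ], Tree, a7 ),
    find_3( [ 2, 1 ], Tree, Found ).
```

After demo( Tree, Found ), Found is a7 and Tree is instantiated to t3( _, _, _, t3( _, _, t3( a7, _, _, _ ), _ ) ): only the nodes on the path exist, and the intermediate values remain variables.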

To make the story complete, here is a little procedure that converts 
nonnegative integers into lists of ternary digits. Note that there are two 
procedures here: conv_3/2 and (auxiliary) conv_3/3. 

conv_3( 0, [ 0 ] ).
conv_3( N, TerN ) :- integer( N ), 0 < N, conv_3( N, [], TerN ).

conv_3( 0, AllDigits, AllDigits ) :- !.
conv_3( N, Z, AllDigits ) :-
    Digit is N mod 3, Nby3 is N / 3,      % integer division
    conv_3( Nby3, [ Digit | Z ], AllDigits ).

FIG. 4.14 Creation of the missing part of a tree.

4.2.6. Access to the Structure of Terms

In Section 4.2.1 we dismissed the possibility of representing tree
nodes with main functors: the term

few( nil, people( many( languages, nil ), speak ) )

would stand for the BST of Fig. 4.1. We shall now show an insertion
routine for such trees. The built-in procedure =.. (univ) is used to
circumvent the problem raised by the potential diversity of the functors.

insert( Node, Tree, NewTree ) :-
    Tree =.. [ Root, Left, Right ],
    insert( Node, Root, Left, Right, NewLeft, NewRight ),
    NewTree =.. [ Root, NewLeft, NewRight ].
insert( Node, nil, Node ).
insert( Node, Leaf, NewTree ) :-
    insert( Node, Leaf, nil, nil, Left, Right ),
    NewTree =.. [ Leaf, Left, Right ].

insert( Node, Root, L, R, NewL, R ) :-
    precedes( Node, Root ), insert( Node, L, NewL ).
insert( Node, Root, L, R, L, NewR ) :-
    precedes( Root, Node ), insert( Node, R, NewR ).

This application of univ is far from typical. As a more realistic example,
consider the problem of translating an arithmetic expression into
reverse Polish form, e.g.

y * sqrt( sqr( x ) + f( 1, y ) )

into

[ y, x, sqr, 1, y, f, '+', sqrt, '*' ]

Here is a possible solution:

revpol( Expr, RevExpr ) :-
    Expr =.. [ Fun | Args ], revargs( Args, [], RevArgs ),
    append( RevArgs, [ Fun ], RevExpr ).

revargs( [], RevAll, RevAll ).
revargs( [ Arg | Args ], RevTillNow, RevAll ) :-
    revpol( Arg, RevArg ),
    append( RevTillNow, RevArg, RevOneMore ),
    revargs( Args, RevOneMore, RevAll ).


We could use difference lists to decrease the cost of multiple appendings,
but the procedures would become even less readable (but try it: this
would be an application of the “flatten” schema, although a little
unwieldy because of the unknown number of arguments). However, a
readable version would not only be much longer, but also less flexible:

revpol( A + B, RevExpr ) :-
    revpol( A, RevA ), revpol( B, RevB ),
    append( RevA, RevB, Aux ), append( Aux, [ '+' ], RevExpr ).

revpol( sin( A ), RevExpr ) :-
    revpol( A, RevA ),
    append( RevA, [ sin ], RevExpr ).

revpol( Atom, [ Atom ] ).

This is a closed schema: to be able to recognize a new function or operator,
we must add a branch to this “case statement”.
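For comparison, here is one way the difference-list variant of the univ-based solution might look. It is a sketch only: the names revpol_dl and revargs_dl are ours, and we assume that univ decomposes a constant c (including a number) into the one-element list [ c ], as it does in standard Prolog.

```prolog
% revpol_dl( Expr, RevExpr ): RevExpr is Expr in reverse Polish form.
revpol_dl( Expr, RevExpr ) :- revpol_dl( Expr, RevExpr, [] ).

% revpol_dl( Expr, Front, Back ): the difference between Front and Back
% is the reverse Polish form of Expr.
revpol_dl( Expr, Front, Back ) :-
    Expr =.. [ Fun | Args ],
    revargs_dl( Args, Front, [ Fun | Back ] ).

% The arguments are translated left to right; the functor follows them.
revargs_dl( [], Back, Back ).
revargs_dl( [ Arg | Args ], Front, Back ) :-
    revpol_dl( Arg, Front, Middle ),
    revargs_dl( Args, Middle, Back ).
```

The call revpol_dl( y * sqrt( sqr( x ) + f( 1, y ) ), R ) instantiates R to [ y, x, sqr, 1, y, f, +, sqrt, * ] without a single call on append.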

Perhaps one of the most important applications of univ (and related 
built-in procedures) is in bootstrapped implementations of Prolog. A basic 
interpreter (see Chapter 6 and Section 7.3) may support Prolog (with a 
very rudimentary syntax) furnished with built-in procedures analogous to 
call and univ. Various user interfaces can then be written in this simplified 
Prolog (see Section 7.4), provided we can convert texts to terms. 
Assume we input the text 

foo( fie( X ), ok, X ) 
and produce its (intermediate) representation:

[ [ f, o, o ], [ [ f, i, e ], V ], [ [ o, k ] ], V ]

with uninstantiated V. (Try to write this reading program: a symbol table
such as those described in Section 4.2.2 must be used to handle variable
names properly.) Now we can glue the intermediate representation
together:

glue( Inter, Inter ) :- var( Inter ), !.
glue( [ FunChars | InterArgs ], Term ) :-
    not alldigits( FunChars ),
    glueargs( InterArgs, Args ),
    pname( Fun, FunChars ), Term =.. [ Fun | Args ].
glue( [ Digits ], Number ) :-
    alldigits( Digits ),
    pnamei( Number, Digits ).

glueargs( [], [] ).
glueargs( [ InterArg | InterArgs ], [ Arg | Args ] ) :-
    glue( InterArg, Arg ), glueargs( InterArgs, Args ).

alldigits( [ Digit | Digits ] ) :-
    digit( Digit ), !, alldigits( Digits ).
alldigits( [] ).

The procedure glue should be called with the second parameter
uninstantiated.

In implementations that do not support indexing (see Section 4.2.4),
univ helps avoid linear search of matching clauses. Consider, for example,
a natural language application program which maintains a dictionary
whose entries can look as follows:

dict( program, noun( inanim ) or verb( intrans ) ).
dict( modular, adj ).
dict( an, article( indef ) ).

Next, assume that each word on input is filtered through this dictionary:

input_a_word( W, Features ) :- 
read_a_word( W ), 

( dict( W, Features ), !; signal_unknown( W ) ). 

Without indexing, dictionary lookup requires time proportional to the 
number of entries. Access to a procedure, i.e. to its first clause, usually 
requires approximately constant time (some form of hashing is used). We 
can have our dictionary in the form 

program( noun( inanim ) or verb( intrans ) ). 

modular( adj ). 

an( article( indef)). 

and define dict as

dict( W, Features ) :-
    Entry =.. [ W, Features ], Entry.

or, equivalently, as

dict( W, Features ) :-
    functor( Entry, W, 1 ), Entry, arg( 1, Entry, Features ).
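A runnable sketch of this lookup is given below. Two details are assumptions of ours rather than part of the text: many current systems raise an error when an undefined procedure is called, so the call is wrapped in the standard catch/3 to turn an unknown word into plain failure; and the entry for program is left out because or would first have to be declared as an operator.

```prolog
% Each word is a separate one-clause procedure.
modular( adj ).
an( article( indef ) ).

% dict( W, Features ): look W up by building and calling the goal W( Features ).
dict( W, Features ) :-
    atom( W ),
    Entry =.. [ W, Features ],
    % An unknown word has no procedure: treat the resulting error as failure.
    catch( Entry, error( existence_error( _, _ ), _ ), fail ).
```

The call dict( an, F ) yields F = article( indef ), while dict( zebra, F ) simply fails. Words that coincide with names of existing one-parameter procedures would, of course, need separate treatment.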

A particularly simple dictionary is a table of keywords for a scanner 
of, say, Pascal: 

const.       type.    array.    record.
function.    var.     begin.    do.

etc. To create the representation of a source program name, we can use
this procedure:

key_or_id( Name, keyword( Name ) ) :- Name, !.
key_or_id( Name, ident( Name ) ).

As a final example, here is the crucial part of a definition of the 
procedure phrase which initiates processing based on a metamorphosis 
grammar: 

phrase( InitialNonterminal, Terminals ) :-
    InitialNonterminal =.. [ Name | Parameters ],
    InitialCall =.. [ Name, Terminals, [] | Parameters ],
    InitialCall.

Note that input and output parameters are added at the beginning of the 
parameter list (rather than at the end, as suggested in Section 3.3). 
4.3. SOME PROGRAMMING HINTS


We have collected here some down-to-earth suggestions which may 
help improve your coding technique in Prolog. Although style is largely a 
matter of taste, some of the things we have to say have long been present 
in Prolog folklore, and we feel fairly confident they are worth presenting. 


4.3.1. Using the Cut Procedure 

Essentially, the cut commits the currently executing procedure to 
whatever it might have done since its activation. This is precisely what 
makes the cut a controversial feature: that it can only be interpreted 
dynamically. On the other hand, the variety of its uses and its power make
it an important factor in the emergence of Prolog as a practical
programming language.

In Chapters 1 and 2 we discuss the cut—in a very general manner— 
both as an extralogical mechanism and as a tool for improving efficiency. 
Here, we shall concentrate on its applications. 

Despite Prolog’s inherent nondeterminism, the usual computation is 
mostly deterministic: the majority of procedures are expected to produce 
a single, well-defined response to any particular set of input data. Most 
procedures are strictly deterministic: at most one clause of a procedure 
applies, regardless of the actual data. 

With the procedural interpretation of Prolog in mind, clauses are 
commonly written as 

head :- tests, actions. 

A clause is executed for its actions which can be performed if and only if
head matches the call and all tests succeed. This conforms to the
fundamental notion of guarded commands (Dijkstra 1975). Some Prolog
dialects, e.g. IC-Prolog (Clark et al. 1982b), even provide special syntax
for “guards”.

If, during a deterministic computation, tests have succeeded, a cut 
executed immediately after tests commits our choice of the clause. The 
cut saves us further—unnecessary—attempts to execute the procedure in 
the case of a failure later on. As an example of this fairly typical situation, 
consider the following: 

% Retrieve ( fetch ) the grammatical description of a word,
% fail if there is no such word in the dictionary.

% The word may be given as a string:
find( String, Description ) :-
    isletterstring( String ),      % yes, a string
    pname( Word, String ), fetch( Word, Description ).

% or as a word, i.e. nullary functor:
find( Word, Description ) :-
    isword( Word ),                % yes, a word
    fetch( Word, Description ).

% Reject bad data:
find( Bad, _ ) :-
    not isletterstring( Bad ),
    not isword( Bad ),             % yes, bad data
    signal( Bad ), fail.

In this procedure, cuts may be safely placed after tests. Notice, however, 
that a cut inserted earlier changes the procedure’s behaviour, and a cut 
after fetch does not work if a word is absent from the dictionary. (It would 
also have ruinous effects if fetch were a nondeterministic generator of 
synonyms.) 

When we adhere to the “guarded command” style of programming,
the built-in procedure not is frequently used to invert tests (but see the
beginning of the next section for a brief discussion of not’s peculiarities!).
However, we would not like to perform expensive tests twice, as in this 
example: 

addunique( Item, List) :- 

presentinalonglist( Item, List), signal_dupl( Item ). 
addunique( Item, List ) :- 

not presentinalonglist( Item, List), additem( Item, List). 

We can replace the inverted test with a cut after the original test: 

addunique( Item, List ) :- 

presentinalonglist( Item, List), !, 
signal_dupl( Item ). 

addunique( Item, List) :- % not present...(Item, List) 

additem( Item, List ). 

This procedure can be interpreted as 

if present...( Item, List ) then signal_dupl( Item )

else additem( Item, List ) 

This is, perhaps, the most frequent application of the cut. It must be 
remembered, though, that this use of the cut is extralogical: a clause with 
a test removed means something else, and it cannot be understood in
separation from the rest of the procedure. Still, the procedure as a whole
is usually sufficiently readable, if we view it as a (possibly nested) 

if ... then ... else if ... then etc. 
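Many current Prolog systems also offer the if-then-else construct ( Test -> Then ; Else ), which commits to the first solution of the test without a visible cut. The sketch below assumes such a system and the usual list procedure member/2; the procedure add_unique, which returns a new list instead of signalling duplicates, is our own simplification of addunique.

```prolog
% add_unique( Item, List, NewList ): add Item unless it is already present.
add_unique( Item, List, NewList ) :-
    (   member( Item, List )         % the expensive test, performed once
    ->  NewList = List               % duplicate: leave the list unchanged
    ;   NewList = [ Item | List ]    % not present: add the item
    ).
```

For instance, add_unique( b, [ a, b ], L ) leaves L = [ a, b ], whereas add_unique( c, [ a, b ], L ) gives L = [ c, a, b ].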

Sometimes cuts inside a procedure are undesirable. One example is a 
procedure that holds data, e.g. 

father( jack, tom ).
father( bill, john ).
etc.

(with empty tests and actions). With a cut in each clause this would not
only look ugly, the procedure would be of no use as a generator! Instead,
we should commit the call on father, as in this procedure:

is_father( Person ) :- father( Person, _ ), !.

The cut serves as a firewall against unwanted backtracking. 

Another example. Consider this group of grammar rules: 

command( Cmd ) --> stop( Cmd ).
command( Cmd ) --> dump( Cmd ).
command( Cmd ) --> load( Cmd ).
command( Cmd ) --> create( Cmd ).
etc.

A “switch” such as command is best committed by the cut after a call, 
e.g. 

phrase( command( Cmd ), Tokens ), ! 

This technique, however, has a disadvantage. The “committing” cut 
affects not only the call but also the calling procedure. If the call being 
committed happens to be the last test in a clause, then the cut plays two 
roles at once. Otherwise we should make it invisible to the surrounding 
clause. To achieve this, we can use this general-purpose “call-and-commit”
procedure:

once( Call) :- Call, !. 
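The difference between a local and a “visible” cut can be observed in a small experiment. In the sketch below the procedure is called call_once, because once is already built into many current systems and cannot be redefined there; pair, pair_cut and the sample lists are likewise ours, and member/2 is assumed to be available.

```prolog
% The text's call-and-commit procedure under another name.
call_once( Call ) :- Call, !.

% The commitment is local: only the choice of Y is frozen.
pair( X, Y ) :-
    member( X, [ 1, 2 ] ), call_once( member( Y, [ a, b ] ) ).

% An explicit cut also commits the calling procedure and the choice of X.
pair_cut( X, Y ) :-
    member( X, [ 1, 2 ] ), member( Y, [ a, b ] ), !.
```

Backtracking through pair( X, Y ) yields X = 1, Y = a and then X = 2, Y = a, whereas pair_cut( X, Y ) yields only the first of these answers.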

Other arguments against “cutting high” are implementation-dependent.
First of all, in many implementations memory requirements are
smaller when there are fewer fail-points, so it may be desirable to perform
cuts as soon as possible. Some implementations also optimise storage
utilisation of tail-recursive procedures (see Section 6.4). A procedure may
become tail-recursive dynamically, after having its remaining clauses cut
off. For example:

% Recognize a sequence of letters/digits.
ld( [ Ch | Chs ] ) --> [ Ch ], { letter( Ch ) }, !, ld( Chs ).
ld( [ Ch | Chs ] ) --> [ Ch ], { digit( Ch ) }, !, ld( Chs ).
ld( [] ) --> [].

(Here, the cuts may protect us against deep recursion, effectively
changing it into iteration.)

Sometimes the use of cuts should be recommended for clarity. We
shall present two versions of the procedure that translates the term
(A1, ..., An) into the list [A1, ..., An] and the term A (other than a
comma-term) into [A]. First the version with “full guards”:

c_list( AA, [ AA ] ) :- var( AA ).
c_list( AA, [ A | As ] ) :-
    not var( AA ), AA = ( A, AATail ), c_list( AATail, As ).
c_list( AA, [ AA ] ) :-
    not var( AA ), not AA = ( _, _ ).

And the version with cuts (here the order of clauses is crucial):

c_list( AA, [ AA ] ) :- var( AA ), !.
c_list( ( A, AATail ), [ A | As ] ) :-
    !, c_list( AATail, As ).
c_list( AA, [ AA ] ).

In nondeterministic procedures cuts should be used cautiously, if we
do not want to inadvertently lose some solutions. In particular, procedures
that compute multiple answers (such as append) should not contain
cuts. A cut after a call on a generator makes it yield only its first
satisfactory answer, as in this small example:

int( 0 ). 

int( NextN ) :- int( N ), NextN is N + 1. 
int( X ), satisfactory( X ), !. 

Cuts after tests in a procedure written according to the “guarded command”
style implement Dijkstra’s don’t-care nondeterminism of if statements:
any one, and exactly one, of the branches with true guards is chosen (in
Prolog, the first one).

Special care must be exercised when adding cuts to procedures intended
to be used in more than one way (such as grammar rules intended
both for analysis and synthesis).

4.3.2. Failure as a Programming Tool

The procedure not, used to invert tests, owes its power and conciseness
to the combined effect of three extralogical mechanisms in Prolog:
variable calls, the cut, and forced failure. Recall the definition:

not X :- X, !, fail.
not _.

Observe that the second clause performs no instantiations, and any
instantiations in X must have been undone on failure. If not succeeds, its
parameter will remain intact. Therefore, not will not return anything. For
example, the call 

not student( X ) 

with uninstantiated X will not find a nonstudent (as might have been 
expected). Instead, it will fail if there is at least one student, e.g. 

student( jim ). student( jill). 

Otherwise it will succeed with X still a variable. If we insist on finding 
nonstudents, we can look for them among NewYorkers: 

newyorker( tim ). newyorker( jim ). 
newyorker( jill). newyorker( amy ). 

Now the command 

newyorker( X ), not student( X ), 
write( X ), nl, fail. 

will print: 

tim 

amy 
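The importance of calling not only after its parameter has been instantiated can be checked with the following sketch. It assumes a current system in which negation is written \+ and findall/3 is available; the procedures nonstudents and wrong_order are ours.

```prolog
student( jim ).
student( jill ).

newyorker( tim ).
newyorker( jim ).
newyorker( jill ).
newyorker( amy ).

% Generate a candidate first, then test it: X is instantiated when tested.
nonstudents( Names ) :-
    findall( X, ( newyorker( X ), \+ student( X ) ), Names ).

% Test first: \+ student( X ) fails outright, as there is a student.
wrong_order( Names ) :-
    findall( X, ( \+ student( X ), newyorker( X ) ), Names ).
```

Here nonstudents( Names ) gives Names = [ tim, amy ], while wrong_order( Names ) gives the empty list.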

It must be emphasized that not called with a term containing variables
does not implement negation properly (see Clark 1978). If the call not
student(X) succeeds, then we shall actually prove that

¬∃x student( x )

which is equivalent to

∀x ¬student( x )

On the other hand, suppose not means ¬. The command

not student( X ).

would then be interpreted (see Chapter 2) as

∀x ¬¬student( x )

i.e. as ∀x student( x ). Its negation (to be proved by reductio ad
absurdum) is

∃x ¬student( x )

This discrepancy was commented upon, for example, by Clark and McCabe
(1980a, 1980b) and Dahl (1980). In IC-Prolog (Clark et al. 1982b) the problem
was solved by treating not calls with variables as erroneous. This is to say,
negation in their system is only applicable to ground predicates.

Except for not, forced failure is used primarily for efficiency. Many
Prolog implementations have no garbage collection, but upon backtracking
almost all of them very efficiently recover some storage holding control
information and term instances (see Chapter 6). We can take advantage
of this in a few rather unobvious but effective tricks. One of them is
“double not”.

On the face of it, the trick is pointless: the call 
not not C 

succeeds if and only if C does. We shall trace the execution of this call to 
show its hidden effect. Assume first that C succeeds; here are successive 
snapshots: 

not not C 
not C, !, fail 
C, !, fail, !, fail 
!, fail, !, fail 

% the cut will commit the internal not 
fail, !, fail 

% RECOVER the storage used by C, 

% and backtrack in the external not 
SUCCESS 

Now, let C fail: 

not not C 
not C, !, fail 
C, !, fail, !, fail 

% backtrack in the internal not,

% succeed via the second clause
!, fail

% the cut commits the external not 

fail 

FAILURE 

Since “double not” does not instantiate anything, it can only be used
in two situations. Either we want to perform a complicated “yes/no” test
(with all interesting variables already instantiated), or we are only
interested in some side-effects of C but we want to recover storage after its
execution. For readability, we usually define two procedures:

check( Cond ) :- not not Cond.

side_effects( Goals ) :- not not Goals.

One example should suffice:

prettyprint( Term ) :- side_effects( doprettyprinting( Term ) ).
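That “double not” leaves its parameter intact is easy to verify. The sketch assumes a current system where negation is written \+ and member/2 is available; the procedure demo is ours.

```prolog
check( Cond ) :- \+ \+ Cond.

% demo( X ): the test succeeds, since member can be satisfied,
% but X is still uninstantiated afterwards.
demo( X ) :- check( member( X, [ a, b ] ) ).
```

The call demo( X ) succeeds, and a subsequent var( X ) succeeds as well.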

Suppose now that we need instantiations produced when executing a
call, and that space still matters. To preserve the results (i.e. the
appropriate terms) over backtracking, we must “put them aside”. Only stored
clauses are immune to failure. The following general-purpose procedure⁷
executes a call, and at the same time “garbage collects” the storage used
by the call:

with_gc( Call ) :-
    once( Call ), assert( 'ASIDE'( Call ) ), fail.
with_gc( Call ) :-
    retract( 'ASIDE'( Call ) ), !.      % commit retract

This method makes sense when assert requires less storage than Call, or 
when the implementation has no general garbage collector but reclaims 
storage left by retracted clauses. 

with_gc can be employed in loop optimisation, which is an important
application of forced failure. Essentially, recursion is the most natural
Prolog counterpart of Pascal-like iteration. Consider a program that takes
large chunks of an even larger text, extracts some data from them, and
puts these data into an open tree. The storage for a step is worth
recovering. Let step assert basta after having encountered the final chunk.
The loop can be written as follows:

buildtree( _ ) :- retract( basta ), !. % remove the signal 

buildtree( Tree ) :- 

with_gc( step( Tree )), buildtree( Tree ). 

(Find a similar solution for closed trees.) 

⁷ This technique was advocated by R. A. Kowalski at the Logic Programming
Workshop in Debrecen, Hungary, 1980.

Suppose now that steps of a loop have no common terms (which
would have to be passed down the loop). This means that a step is
executed only for its side-effects. For example, consider the problem of
reading in a Prolog program up to the clause end. Let the procedure
clause_in perform one step: read a clause and assert it (unless it is end or
incorrect). The following procedure repeatedly calls on clause_in, and
recovers storage after each step:

getprog :- clause_in( Clause ), Clause = end, !.
getprog :- getprog.

This loop can be made even more concise if we use a “failure 
screen”. This is a procedure that always succeeds nondeterministically, 
i.e. leaves room for yet another success: 

repeat. 

repeat :- repeat. 

(it is standard in some Prolog implementations). The loop can be expressed
as

getprog :- repeat, clause_in( C ), C = end, !. 

After C = end succeeds, the cut will remove the pending choice in repeat, 
and so terminate the loop. 

For this technique to work, the core of the loop must be deterministic, 
as otherwise a failure of C = end would evoke another attempt to execute 
an already executed step. Usually it suffices to enclose the call for a step 
in once(_): 

getprog :- repeat, once( clause_in( C )), C = end, !. 
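The same loop structure can be tried without any input, as in the sketch below. A stored clause pending/1, holding a list of items, stands in for the text being read; pending, next_item and consume are hypothetical names, and the dynamic declaration is what most current systems require for procedures modified at run time.

```prolog
:- dynamic pending/1.

% A hypothetical input queue standing in for read/1.
pending( [ a, b, end ] ).

% next_item( Item ): remove the next item from the queue and return it.
next_item( Item ) :-
    retract( pending( [ Item | Rest ] ) ),
    assert( pending( Rest ) ).

% Failure-driven loop: take items until the item end has been consumed.
consume :- repeat, once( next_item( Item ) ), Item = end, !.
```

After a call on consume the queue is empty, i.e. pending( [] ) holds. Note that the sketch would loop forever on a queue that does not contain end, just as getprog would on a text without the closing clause.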

A special form of forced failure is caused by tagfail⁸. This built-in
procedure is described in Section 5.12, together with other associated 
procedures. They are all primarily used for error handling, as they allow 
bypassing of large fragments of a computation. Here we shall present an 
application of tag and tagfail for exiting loops. 

An extremely simplified interactive executor of Prolog commands can 
be programmed as follows: 

ear :- tag( loop ). 

ear. 

loop :- repeat, read( C ), once( C ), fail. 

⁸ It is only available in Toy (see Section 5.12), but something similar is
present or can be programmed in several other implementations of Prolog.

The execution of

tagfail( loop )

terminates the loop: tag(loop) fails, and the second clause of ear promptly
succeeds. With a step defined as

step :- read( C ), once( C ).

and loop redefined as

loop :- repeat, tag( step ), fail.

we can also exit one step by calling

tagfail( step )

4.3.3. Clauses as Global Data 

The program modification procedures— assert, retract and the like— 
are first of all used to maintain Prolog data bases (see Section 8.2). They 
can also be used in automodifying procedures, those which assert or 
retract their own clauses; this is an extremely dubious programming trick, 
and is not recommended, especially since such programs tend to be rather 
subtly implementation-dependent. 

Modification procedures are also used to store so-called global data.
In Prolog implementations that do not support modularisation, the data
kept in program clauses (notably unit clauses) are accessible to all
procedures, i.e. global. Such data are significant in Prolog because they are
not affected by backtracking (see with_gc in the previous section). Also,
they are sometimes more convenient to handle than information passed around
via parameters. One example is a “switch”, i.e. a parameterless unit clause
whose presence or absence provides a simple yes/no test. For instance,
we can supply terse or wordy error messages:

message( Code ) :- terse, short_mes( Code ), nl, !.
message( Code ) :- long_mes( Code ), nl, !.

short_mes( sym( S ) ) :- display( '?sym ' ), display( S ).

long_mes( sym( S ) ) :-
    display( 'Unexpected symbol on input: ' ),
    display( S ), nl,
    display( ' The remainder of the command will be ignored.' ).

A switch can be easily turned on: 

turnon( Switch ) :- Switch, !.              % already on
turnon( Switch ) :- assert( Switch ).

and off:

turnoff( Switch ) :- retract( Switch ), !.  % fails if Switch was off
turnoff( _ ).                               % already off

We can also revert the state of a switch (on -> off, off -> on):

flip( Switch ) :- retract( Switch ), !.
flip( Switch ) :- assert( Switch ).
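In many current systems a switch must be declared dynamic before it is first tested; otherwise a call on a switch that is off may raise an error instead of failing. The sketch below adds such a declaration for the switch terse. The procedure style, which merely reports the current setting, is ours.

```prolog
:- dynamic terse/0.       % the switch is initially off

% flip( Switch ): reverse the state of Switch.
flip( Switch ) :- retract( Switch ), !.
flip( Switch ) :- assert( Switch ).

% style( S ): S is short if the switch terse is on, long otherwise.
style( short ) :- terse, !.
style( long ).
```

Initially style( S ) gives S = long; after flip( terse ) it gives S = short, and a second flip( terse ) restores the original state.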

Switches are really cumbersome to program without clausal data. It is
not difficult to rewrite message:

message( Code, terse ) :- short_mes( Code ), nl, !.
message( Code, wordy ) :- long_mes( Code ), nl, !.

but the Terseness parameter ought to be carried everywhere throughout
the program; and dynamic reversal of a switch can be somewhat messy.

Our final example demonstrates how assertions can be used to memorize
results of expensive computations for future use. Let the procedure
integrate perform symbolic integration of a given formula (and fail if it
cannot be done). If we are going to use this procedure frequently, we may
wish to avoid recomputing integrals. To this end, we should store every
integral, once computed, and always try to find a ready answer before
launching actual integration. Here is a possible solution:

integral( Expr, IExpr ) :- stored_integral( Expr, IExpr ), !.
integral( Expr, IExpr ) :-
    integrate( Expr, IExpr ),
    assert( stored_integral( Expr, IExpr ) ).

In fact, we have thus furnished our program with a primitive learning 
capacity. 
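Since integrate is not defined here, the same schema can be tried on a simpler computation. The sketch below memorizes Fibonacci numbers; the choice of example and the names fib, compute_fib and stored_fib are ours, and the dynamic declaration is an assumption about current systems.

```prolog
:- dynamic stored_fib/2.

% fib( N, F ): look for a ready answer first, compute and store otherwise.
fib( N, F ) :- stored_fib( N, F ), !.
fib( N, F ) :-
    compute_fib( N, F ),
    assert( stored_fib( N, F ) ).

% The actual computation; it calls fib, so partial results are reused.
compute_fib( 0, 0 ).
compute_fib( 1, 1 ).
compute_fib( N, F ) :-
    N > 1,
    N1 is N - 1, N2 is N - 2,
    fib( N1, F1 ), fib( N2, F2 ),
    F is F1 + F2.
```

The call fib( 20, F ) gives F = 6765 and leaves stored_fib clauses for all the intermediate values, so that later calls are answered by a single lookup.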


4.4. EXAMPLES OF PROGRAM DESIGN

In this section we look at several tiny programming problems and
their solutions which result from more or less formal analysis. This is not
a real exercise in derivation of programs from formal specifications (see
Hogger 1979; Gregory 1980; Burstall and Darlington 1977). This is, at
best, an illustration of such derivation, not very rigorous and with
formulae kept as simple as possible.

These particular problems present no difficulty to experienced
programmers, who can readily solve them without resorting to sophisticated
techniques. Simple as they are, they help demonstrate how logic formulae,
which lend justification to a program designed in a traditional way,
can also be viewed as the same program (“modulo” some clean
transformations). Implications of this observation for logic programming are
far-reaching and largely uninvestigated; see Shapiro (1983a) for fascinating
examples of Prolog programs which are but a by-product of theoretical
considerations.

Some of the procedures discussed below can be bi-directional, but we 
intentionally neglect such possibilities. As an exercise, try to discover 
some of their less obvious applications. 

Formulae will be written according to the conventions of Prolog-10: 
variable names are capitalized, functor names begin with small letters. 

4.4.1. List Reversal 

Let rev(X) denote the reverse of list X, let X with A denote the result 
of attaching A at the end of list X (e.g., [p, q] with r = [p, q, r]). Let X = Y 
mean: X matches Y. 

Assuming that X with A has already been defined, a possible definition 
of rev is: 

(4.1) rev( []) = [] 

(4.2) rev( [ A | Tail ] ) = rev( Tail) with A 

Now, recall that in Prolog we can comfortably express relations such as 
“the reversal of X is Y” (which implicitly defines Y as rev(X)) without 
resorting to the notion of equality. To re-express (4.2) accordingly, we 
begin with the introduction of a new variable to denote rev(Tail): 

(4.3) T - rev( Tail ) =£> rev( [ A | Tail ] ) = T with A 

This formula is equivalent to (4.2). Another new variable will denote T 
with A: 

(4.4) T = rev( Tail ) => ( TA = T with A =>
          rev( [ A | Tail ] ) = TA )

which is equivalent to

(4.5) ( T = rev( Tail ) ∧ TA = T with A ) =>
          rev( [ A | Tail ] ) = TA

We shall rewrite this implication, and the formula (4.1), using
reverse(X, Y) instead of rev(X) = Y, and attach(X, Y, Z) instead of Z = X
with Y:

(4.6) reverse( Tail, T ) ∧ attach( T, A, TA ) =>
          reverse( [ A | Tail ], TA )

(4.7) reverse( [], [] )

These two formulae are exactly the logical interpretation of the following
procedure:

reverse( [ A | Tail ], TA ) :-
    reverse( Tail, T ), attach( T, A, TA ).
reverse( [], [] ).

The procedure attach can be derived in a similar way: 

(4.8) [] with A = [ A ]

(4.9) [ B | Tail ] with A = [ B | ( Tail with A ) ]

From (4.9) we can obtain

(4.10) TA = Tail with A => [ B | Tail ] with A = [ B | TA ]

and this (together with (4.8)) is rewritten as

(4.11) attach( Tail, A, TA ) => attach( [ B | Tail ], A, [ B | TA ] )

(4.12) attach( [], A, [ A ] )
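Collected into one file, the two derived procedures can be tried out directly. The following is a minimal sketch that transcribes (4.6), (4.7), (4.11) and (4.12), assuming a system with Prolog-10 (Edinburgh) syntax. On systems where reverse/2 is already a protected built-in, the procedure may have to be given another name.

```prolog
% reverse(List, Reversed): formulae (4.6) and (4.7).
reverse([A | Tail], TA) :-
    reverse(Tail, T), attach(T, A, TA).
reverse([], []).

% attach(List, Item, Result): Result is List with Item attached at the end;
% formulae (4.11) and (4.12).
attach([B | Tail], A, [B | TA]) :-
    attach(Tail, A, TA).
attach([], A, [A]).

% ?- reverse([p, q, r], X).
% X = [r, q, p]
```

Each call of reverse traverses the whole intermediate result once more through attach, so the number of steps grows with the square of the list length.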

These derivations are by no means unique. Here is another reasoning 
that starts with (4.2). We first introduce TA to denote rev([A | Tail]), and 
get 

(4.13) TA = rev( [ A | Tail ] ) => TA = rev( Tail ) with A

which is equivalent to (4.2). Now we introduce T:

(4.14) ( TA = rev( [ A | Tail ] ) ∧ T = rev( Tail ) ) =>
           TA = T with A

This is easily translated into Prolog:

attach( T, A, TA ) :-
    reverse( [ A | Tail ], TA ), reverse( Tail, T ).

In short: we managed to define attach by reverse, but the definition is only 
useful if we can define reverse independently of attach. 

Another method of reversing a list stems from its interpretation as a
stack (see Section 4.2.1). If we move the items of one stack onto another,
they will come up in reversed order. Let Stack1 and Stack2 denote the
stacks before this reversal. The final content of the second stack will be

rev( Stack1 ) ++ Stack2

with X ++ Y denoting the result of appending Y to X. The following two
equalities define rev recursively:

(4.15) rev( [] ) ++ Stack2 = Stack2

(4.16) rev( [ A | Tail ] ) ++ Stack2 = ( rev( Tail ) ++ [ A ] ) ++ Stack2

Now, ++ is associative, and [A] ++ Stack2 = [A | Stack2], so that we
can transform (4.16) into

(4.17) rev( [ A | Tail ] ) ++ Stack2 = rev( Tail ) ++ [ A | Stack2 ]

Next, we introduce a new variable Final:

(4.18) Final = rev( Tail ) ++ [ A | Stack2 ] =>
           Final = rev( [ A | Tail ] ) ++ Stack2

Let reverse2(X, Y, Z) denote the formula Z = rev(X) ++ Y. From
(4.15) and (4.18) we get

(4.19) reverse2( [], Stack2, Stack2 )

(4.20) reverse2( Tail, [ A | Stack2 ], Final ) =>
           reverse2( [ A | Tail ], Stack2, Final )

(or, accordingly, the same in Prolog). For Y = [], reverse2(X, Y, Z) reads 
Z = rev(X), so to get the reversal of L we must call 

reverse2( L, [], LReversed ) 

—indeed, Stack2 must be initially empty. 
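Written out as clauses, (4.19) and (4.20) give the following sketch. The wrapper reverse_acc is a hypothetical name introduced only for this illustration; it supplies the initially empty second stack.

```prolog
% reverse2(X, Y, Z): Z = rev(X) ++ Y, formulae (4.19) and (4.20).
reverse2([], Stack2, Stack2).
reverse2([A | Tail], Stack2, Final) :-
    reverse2(Tail, [A | Stack2], Final).

% Hypothetical wrapper (not in the text): start with an empty stack.
reverse_acc(L, LReversed) :-
    reverse2(L, [], LReversed).

% ?- reverse_acc([a, b, c], R).
% R = [c, b, a]
```

Every item is moved exactly once, so this version takes a number of steps proportional to the list length.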

Notice that in going from (4.17) to (4.18) another direction of the
implication could have been chosen. This choice would lead to the Prolog
clause

reverse2a( Tail, [ A | Stack2 ], Final ) :-
    reverse2a( [ A | Tail ], Stack2, Final ).

which defines the shorter list in terms of the longer. Even though it is
logically correct, operationally it is unrealistic: neither this nor (4.19)
would match the initial call with non-empty L.

We shall conclude this section with an even less formal derivation of
difference-list reversal. Let the list to be reversed be L -- Z, where L =
[A1, ..., An | Z]. We can write

rev( L -- Z ) = [ An | X ] -- Y

with X -- Y = rev([A1, ..., An-1 | W] -- W). Since W is an arbitrary term,
we can assume W = [An | Z], so that

X -- Y = rev( L -- [ An | Z ] )

We now express the longer list by the shorter (see reverse2a above!):

rev( L -- [ An | Z ] ) = X -- Y =>
    rev( L -- Z ) = [ An | X ] -- Y

and rewrite it in Prolog, with reverse_d(X, Y) instead of rev(X) = Y:

reverse_d( L -- Z, [ An | X ] -- Y ) :-
    reverse_d( L -- [ An | Z ], X -- Y ).

The base clause,

reverse_d( Z -- Z, Y -- Y ).

must come (i.e. be tried) first, because otherwise each call with a variable
second parameter will fall into infinite recursion (you may wish to check
this more thoroughly). This is where the peculiarities of Prolog come into
play, and obscure the so-far clean derivation. It is even worse: we have
missed one weakness of almost all Prolog implementations: the absence
of so-called occur check during unification (see Section 1.2.3). Therefore,
the base clause matches calls with a non-empty list as the first parameter,
if only the list ends with a variable. For example, the call

reverse_d( [ a, b | Z ] -- Z, Rev )

instantiates Z <- [a, b | Z] and Rev <- Y -- Y, contrary to our
expectations. One possible remedy is to instantiate the final variable as []
before going on, but to this end both clauses of reverse_d must be duplicated.
The complete procedure follows.


reverse_d( [] -- [], Y -- Y ).
reverse_d( L -- [], [ An | X ] -- Y ) :-
    reverse_d( L -- [ An ], X -- Y ).
reverse_d( Z -- Z, Y -- Y ).
reverse_d( L -- Z, [ An | X ] -- Y ) :-
    reverse_d( L -- [ An | Z ], X -- Y ).


Check that even this improved version loops for “negative” lists such as 
[b] -- [a, b]. As you see, difference lists are useful but can be rather tricky. 
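The remedy described above (a duplicated pair of clauses that first instantiates the final variable as []) can be sketched as follows. The operator declaration is an assumption: the text does not give the type or priority of --, and yfx with priority 400 is chosen here only so that the terms parse. Only the first answer should be requested, since backtracking into the remaining clauses is not guaranteed to terminate.

```prolog
:- op(400, yfx, --).    % assumed declaration of the difference-list functor

% Clauses for a list whose final variable can be instantiated as [].
reverse_d([] -- [], Y -- Y).
reverse_d(L -- [], [An | X] -- Y) :-
    reverse_d(L -- [An], X -- Y).
% The original pair: base clause first, then the recursive clause.
reverse_d(Z -- Z, Y -- Y).
reverse_d(L -- Z, [An | X] -- Y) :-
    reverse_d(L -- [An | Z], X -- Y).

% ?- reverse_d([a, b, c | Z] -- Z, R -- []).
% first answer: Z = [], R = [c, b, a]
```

The second clause binds Z to [] before any recursive call is made, so the base clause Z -- Z is never matched against a list that still ends with an unbound variable.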


4.4.2. Sorting 

We shall derive three procedures to sort a closed list of integers in
ascending order. The first two implement insertion sort and a very simple
transposition sort, a variation of “bubble sort”. Both have running time
proportional to the square of list length (but both seem passable because
few Prolog applications require fast sorting procedures). The third
procedure is the simplest quicksort, which takes less time but uses more
space (the same justification applies).

Let X into Y denote the list that results from inserting the integer X 
into the ordered list Y. For example, 

5 into [ 4, 7, 10 ] = [ 4, 5, 7, 10 ]. 

A possible definition of insertion consists of three formulae: 

(4.21) A into [] = [ A ]

(4.22) A > B => A into [ B | Tail ] = [ B | ( A into Tail ) ]

(4.23) A =< B => A into [ B | Tail ] = [ A, B | Tail ]

The formula (4.22) can be rewritten as

(4.24) A > B ∧ AT = A into Tail => A into [ B | Tail ] = [ B | AT ]

and then we can use insert(X, Y, Z) instead of X into Y = Z to get the
following procedure:

insert( A, [], [ A ] ).
insert( A, [ B | Tail ], [ B | AT ] ) :- A > B, insert( A, Tail, AT ).
insert( A, [ B | Tail ], [ A, B | Tail ] ) :- A =< B.

Now, let sorted(X) denote the sorted permutation of the list X. We 
can define sorted in the following way: 

(4.25) sorted( [] ) = []

(4.26) sorted( [ A | Tail ] ) = A into sorted( Tail )

The latter formula can be replaced by

(4.27) ST = sorted( Tail ) ∧ AST = A into ST =>
           sorted( [ A | Tail ] ) = AST

To express it in Prolog, we shall rewrite sorted(X) = Y as ins_sort(X, Y),
and get these two clauses:

ins_sort( [], [] ).
ins_sort( [ A | Tail ], AST ) :-
    ins_sort( Tail, ST ), insert( A, ST, AST ).

The order of calls in the second clause is not accidental: the tests in insert 
require fully instantiated parameters, so the procedure would not work if 
insert came first! 
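Put together, insertion sort reads as in the sketch below, a direct transcription of the clauses derived from (4.21) to (4.27). The queries in the comments are illustrations only.

```prolog
% insert(X, Ordered, Result): Result is Ordered with X inserted in place.
insert(A, [], [A]).
insert(A, [B | Tail], [B | AT]) :- A > B, insert(A, Tail, AT).
insert(A, [B | Tail], [A, B | Tail]) :- A =< B.

% ins_sort(List, Sorted): sort the tail first, then insert the head.
ins_sort([], []).
ins_sort([A | Tail], AST) :-
    ins_sort(Tail, ST), insert(A, ST, AST).

% ?- insert(5, [4, 7, 10], X).     X = [4, 5, 7, 10]
% ?- ins_sort([3, 1, 2], S).       S = [1, 2, 3]
```

Because ins_sort is called before insert, the comparisons A > B and A =< B always see integers rather than unbound variables.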

Transposition sorting results from the observation that a sequence is
unordered iff it contains an unordered pair of contiguous items (e.g. A, B
such that A > B, if we consider the ascending order). Each step of a
sorting algorithm should increase the “orderedness” of the sequence, e.g.
by swapping A and B.

The following formula characterizes this sorting method:

(4.28) L = X ++ [ A, B | Y ] ∧ A > B ∧ LT = X ++ [ B, A | Y ]
           => sorted( L ) = sorted( LT )

where X ++ Y denotes Y appended to X. We should describe explicitly
the “less ordered” sequence by the “more ordered”, e.g. thus:

(4.29) L = X ++ [ A, B | Y ] ∧ A > B ∧ LT = X ++ [ B, A | Y ]
       ∧ SL = sorted( LT ) => SL = sorted( L )

Using trans_sort(X, Y) for sorted(X) = Y, and append(X, Y, Z) for
X ++ Y = Z, we can rewrite (4.29) into Prolog:

trans_sort( L, SL ) :-
    append( X, [ A, B | Y ], L ), A > B,
    append( X, [ B, A | Y ], LT ), trans_sort( LT, SL ).

Suppose now that for no X, A, B, Y we have L = X ++ [A, B | Y]
∧ A > B, i.e. that L is either ordered or too short (and also ordered!).
More formally:

(4.30) ¬( L = X ++ [ A, B | Y ] ∧ A > B ) => L = sorted( L )

When we rewrite this in Prolog, we shall drop the premise and place the
resulting clause after the recursive one. The first two calls in that clause
can be regarded as tests: does L contain a two-item subsequence, and is
this subsequence unordered? The clause fails if this is not the case, and
the premise of (4.30) becomes trivially true. We are left with the clause

trans_sort( L, L ).

which is exactly the required base clause: we proceed from “less
ordered” sequences, so that eventually we must get an ordered
permutation.
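The two clauses can be run as in the sketch below, which assumes that append/3 is available as a built-in or library procedure. Since the premise of (4.30) was dropped, the base clause also matches unsorted lists on backtracking; only the first answer is the sorted permutation.

```prolog
% trans_sort(List, Sorted): swap the first out-of-order neighbours, repeat.
trans_sort(L, SL) :-
    append(X, [A, B | Y], L), A > B,
    append(X, [B, A | Y], LT), trans_sort(LT, SL).
% Reached first only when no contiguous pair A, B with A > B exists.
trans_sort(L, L).

% ?- trans_sort([3, 1, 2], S).
% first answer: S = [1, 2, 3]
```

For [3, 1, 2] the intermediate lists are [1, 3, 2] and [1, 2, 3]; the recursive clause then fails for every split, and the base clause returns the list unchanged.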

The arrangement of calls in the first clause is crucial. To begin with,
we repeatedly isolate any two contiguous items (this fails if the list is too
short), and we look at their ordering. The first improperly ordered pair
terminates this process, and we recursively sort the “improved”
sequence. The procedure is attributed to van Emden (Coelho et al. 1980).

The last sorting algorithm we are going to program in Prolog is the
well-known quicksort (Hoare 1962). For a given sequence L and its
element A, let small(L, A) denote the subsequence consisting of all items
smaller than A, and large(L, A) those larger than A. Items equal to A will
fall, say, into small(L, A). The following formulae describe two possible
situations:

(4.31) sorted( [ A | L ] ) = sorted( small( L, A ) ) ++ [ A ] ++
           sorted( large( L, A ) )

(4.32) sorted( [] ) = []

The usual transformations of (4.31) give, for example,

(4.33) small( L, A ) = LAs ∧ large( L, A ) = LAI ∧
       sorted( LAs ) = SLAs ∧ sorted( LAI ) = SLAI =>
           sorted( [ A | L ] ) = SLAs ++ [ A ] ++ SLAI

When implementing quicksort, a standard practice is to compute
small(L, A) and large(L, A) simultaneously, i.e. to introduce

partition( L, A, LAs, LAI )

instead of the first two equalities in (4.33). Here is the Prolog code for
partition (it can be derived in a straightforward way):

partition( [ X | Tail ], A, [ X | Small ], Large ) :-
    X =< A, partition( Tail, A, Small, Large ).
partition( [ X | Tail ], A, Small, [ X | Large ] ) :-
    X > A, partition( Tail, A, Small, Large ).
partition( [], _, [], [] ).
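A quick check of partition on its own, before it is used inside quicksort, may be helpful. The sketch repeats the three clauses so that it can be loaded by itself; the sample list is arbitrary.

```prolog
% partition(List, Pivot, Small, Large): items =< Pivot go to Small,
% items > Pivot go to Large; the relative order of items is preserved.
partition([X | Tail], A, [X | Small], Large) :-
    X =< A, partition(Tail, A, Small, Large).
partition([X | Tail], A, Small, [X | Large]) :-
    X > A, partition(Tail, A, Small, Large).
partition([], _, [], []).

% ?- partition([3, 8, 1, 5], 4, S, L).
% S = [3, 1], L = [8, 5]
```

An item equal to the pivot lands in the first result list, in line with the convention that items equal to A fall into small(L, A).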

The formula (4.33) should be transformed in the usual way:

(4.34) partition( L, A, LAs, LAI ) ∧
       sorted( LAs ) = SLAs ∧ sorted( LAI ) = SLAI ∧
       SLAs ++ [ A | SLAI ] = Sorted =>
           sorted( [ A | L ] ) = Sorted

This is directly expressible in Prolog, with quick_sort(X, Y) denoting the
equality sorted(X) = Y:

quick_sort( [ A | L ], Sorted ) :-
    partition( L, A, LAs, LAI ),
    quick_sort( LAs, SLAs ), quick_sort( LAI, SLAI ),
    append( SLAs, [ A | SLAI ], Sorted ).
quick_sort( [], [] ).

In the worst case, the cost of appending sorted fragments is
proportional to n^2 for a list of length n. We can avoid appending altogether
exactly as we did in reverse2 in the previous section. We take the empty
stack, sort large(L, A) and push sorted(large(L, A)) onto the stack. Next,
we stack A, and finally sorted(small(L, A)).

The formulae corresponding to (4.31), (4.32), and similar to (4.15),
(4.16), are as follows:

(4.35) sorted( [ A | L ] ) ++ Stack2 =
           sorted( small( L, A ) ) ++ [ A ] ++
           sorted( large( L, A ) ) ++ Stack2

(4.36) sorted( [] ) ++ Stack2 = Stack2

We can now repeat the same reasoning and replace (4.35) with

(4.37) partition( L, A, LAs, LAI ) ∧
       sorted( LAs ) ++ [ A ] ++ sorted( LAI ) ++ Stack2 = Sorted
           => sorted( [ A | L ] ) ++ Stack2 = Sorted

The lefthand side equality in (4.37) must be rewritten as

(4.38) sorted( LAI ) ++ Stack2 = LargeStacked ∧
       sorted( LAs ) ++ [ A ] ++ LargeStacked = Sorted

We introduce q_sort(X, Y, Z) for the equality sorted(X) ++ Y = Z, and
get the following procedure:

q_sort( [ A | L ], Stack2, Sorted ) :-
    partition( L, A, LAs, LAI ),
    q_sort( LAI, Stack2, LargeStacked ),
    q_sort( LAs, [ A | LargeStacked ], Sorted ).
q_sort( [], Stack2, Stack2 ).

And wrap it in

quick_sort_2( List, Sorted ) :- q_sort( List, [], Sorted ).

This version of quicksort is also attributed to van Emden (Coelho et
al. 1980).
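The two versions of quicksort can be compared side by side. The sketch below assumes append/3 is available as a built-in or library procedure and repeats partition so that the file is self-contained; both procedures should return the same list for the same input.

```prolog
partition([X | Tail], A, [X | Small], Large) :-
    X =< A, partition(Tail, A, Small, Large).
partition([X | Tail], A, Small, [X | Large]) :-
    X > A, partition(Tail, A, Small, Large).
partition([], _, [], []).

% Version with append, from (4.34).
quick_sort([A | L], Sorted) :-
    partition(L, A, LAs, LAI),
    quick_sort(LAs, SLAs), quick_sort(LAI, SLAI),
    append(SLAs, [A | SLAI], Sorted).
quick_sort([], []).

% Version with a stack, from (4.36) to (4.38): q_sort(X, Y, Z) means
% Z = sorted(X) ++ Y.
q_sort([A | L], Stack2, Sorted) :-
    partition(L, A, LAs, LAI),
    q_sort(LAI, Stack2, LargeStacked),
    q_sort(LAs, [A | LargeStacked], Sorted).
q_sort([], Stack2, Stack2).

quick_sort_2(List, Sorted) :- q_sort(List, [], Sorted).

% ?- quick_sort([2, 3, 1], S).      S = [1, 2, 3]
% ?- quick_sort_2([2, 3, 1], S).    S = [1, 2, 3]
```

In q_sort the sorted large items are stacked first, then the pivot, then the sorted small items, so no fragment is ever copied by append.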

Notice that the recursive calls on q_sort can be interchanged. A
partly uninstantiated stack will be appended to sorted(small(L, A)); the
other call will then fully instantiate the stack. Actually, the pairs Sorted,
[A | LargeStacked] and LargeStacked, Stack2 can be interpreted as
difference lists. Try to derive more formally a version of quicksort with
difference lists.

4.4.3. Euler Paths ⁹

We shall try to solve in Prolog the problem of finding Euler paths in an
undirected graph. For the sake of completeness, here are the basic
definitions. An undirected graph is the pair (V, E), with V a finite set of
vertices and E a set of edges. A vertex is labelled with a unique name. An
edge is an unordered pair of different vertices, usually interpreted as a
connection between them. A graph is often modelled by a drawing with a point
for each vertex and a line (connecting the two vertices) for each edge.
Figure 4.15 shows a graph which consists of five vertices and eight edges.
A path from vertex X to vertex Y is a sequence of edges such that
contiguous edges share a vertex, X belongs to the first edge, Y to the last.
For example,

( a, b ), ( b, c ), ( c, d ), ( d, e )

is a path from a to e in the graph of Fig. 4.15. The same path can be
unambiguously represented as a sequence of vertices:

a b c d e

An Euler path is a path passing through all vertices, in which every
edge occurs exactly once. An Euler graph is a graph that contains an
Euler path. For our graph,

d b a c b e c d e

is an example of an Euler path. If we remove the edge bc, the resulting graph
will not be an Euler graph (you may wish to check this).

⁹ The problem (described as “drawing a picture”) was solved in Prolog by
Szeredi (1977).

We shall develop a very simple program, depending only on the most 
intuitive properties, which looks somewhat blindly for Euler paths. A 
more efficient algorithm arises from a theorem that characterizes Euler 
graphs. We shall quote the theorem at the end of this section. 

At first, we must choose a method of representing graphs. We can
assume that the graph contains no isolated vertices (vertices which do not
belong to any edge); otherwise, it is certainly not an Euler graph. A graph
without isolated vertices can be represented by its set of edges alone. An
edge is an unordered pair, i.e. a set of two vertices. Since we have no sets
in Prolog (as in most programming languages), we shall represent edges
with ordered pairs:

V1 <-> V2

(<-> is a non-associative infix functor), and we shall try to make the
program account for the commutativity of pairs.

Euler graphs have the following two properties:

1. A graph with one edge is an Euler graph.

2. Suppose we take out an edge, and what remains is an Euler graph with
an Euler path starting with one of this edge’s vertices; then the whole
graph is an Euler graph (and we happened to have removed a terminal
edge of an Euler path).

Let paths be represented with lists of vertex names, and let path(E, P)
mean “E is the set of edges of an Euler graph, and P is an Euler path in
this graph”. The first property above can be rewritten as two formulae:

(4.39) path( { V1 <-> V2 }, [ V1, V2 ] )

(4.40) path( { V2 <-> V1 }, [ V1, V2 ] )

In other words, both arrangements of vertices are equally satisfactory.

Here is how the second property can be formalized (\ denotes set
subtraction):

(4.41) path( E \ { V1 <-> V2 }, [ V2 | RestofPath ] ) =>
           path( E, [ V1, V2 | RestofPath ] )

(4.42) path( E \ { V2 <-> V1 }, [ V2 | RestofPath ] ) =>
           path( E, [ V1, V2 | RestofPath ] )

Before we rewrite (4.39-4.42) into Prolog, we must finally decide how
to represent sets. We can use any structure capable of holding uniform
data; to keep things simple we shall use lists. (Another possibility would
be to represent each edge as a separate clause, but then we would have no
easy way of passing a set of edges as a parameter to a path-finding
procedure.)

The formulae (4.41) and (4.42) must be transformed to get rid of the
complicated expression inside path; for example, (4.41) becomes

(4.43) E1 = E \ { V1 <-> V2 } ∧ path( E1, [ V2 | RestofPath ] ) =>
           path( E, [ V1, V2 | RestofPath ] )

With the set {V1 <-> V2} represented as the list [V1 <-> V2], and
with takeout(X, Y, Z) denoting the equality X \ {Y} = Z, we can write
down the procedure path:

path( [ V1 <-> V2 ], [ V1, V2 ] ).
path( [ V2 <-> V1 ], [ V1, V2 ] ).
path( E, [ V1, V2 | RestofPath ] ) :-
    takeout( E, V1 <-> V2, E1 ), path( E1, [ V2 | RestofPath ] ).
path( E, [ V1, V2 | RestofPath ] ) :-
    takeout( E, V2 <-> V1, E1 ), path( E1, [ V2 | RestofPath ] ).

Notice that details of set representation are transparent to the recursive
clauses.

A list version of takeout can be defined in a straightforward manner,
so we shall skip a detailed derivation:

takeout( [ V1 <-> V2 | E1 ], V1 <-> V2, E1 ).
takeout( [ Edge | Edges ], TheEdge, [ Edge | Remainder ] ) :-
    takeout( Edges, TheEdge, Remainder ).

This program is crying out for optimisation: in the worst cases we can 
traverse the list E twice before locating the edge to be taken out. One 
solution, actually presented in Szeredi (1977), is to make takeout, rather 
than path, sensitive to the order of vertices. This can be easily achieved 
by adding another base clause to takeout: 

takeout( [ V2 <-> V1 | E1 ], V1 <-> V2, E1 ).

and deleting any one of the two recursive clauses of path. 
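For experimenting, the unoptimised program (four clauses of path, two of takeout) can be loaded as in the sketch below. Two things are assumptions made for this illustration: the operator declaration for <-> (the text only says the functor is infix and non-associative, so priority 700 is a guess), and the fact graph/1, whose edge list is read off the example Euler path d b a c b e c d e.

```prolog
:- op(700, xfx, <->).    % assumed priority; xfx = non-associative infix

path([V1 <-> V2], [V1, V2]).
path([V2 <-> V1], [V1, V2]).
path(E, [V1, V2 | RestofPath]) :-
    takeout(E, V1 <-> V2, E1), path(E1, [V2 | RestofPath]).
path(E, [V1, V2 | RestofPath]) :-
    takeout(E, V2 <-> V1, E1), path(E1, [V2 | RestofPath]).

% takeout(Edges, Edge, Rest): Rest is Edges without one occurrence of Edge.
takeout([V1 <-> V2 | E1], V1 <-> V2, E1).
takeout([Edge | Edges], TheEdge, [Edge | Remainder]) :-
    takeout(Edges, TheEdge, Remainder).

% Hypothetical fact holding the eight edges of the example graph.
graph([d <-> b, b <-> a, a <-> c, c <-> b,
       b <-> e, e <-> c, c <-> d, d <-> e]).

% ?- graph(G), path(G, [d, b, a, c, b, e, c, d, e]).   succeeds
% ?- graph(G), path(G, P).    enumerates Euler paths on backtracking
```

Each recursive call works on a list that is one edge shorter, so every query with a given edge list terminates, though possibly after a long blind search.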

The procedure path can be used non-deterministically, to produce all
Euler paths in a given graph, or with a cut, to check whether the graph is
an Euler graph (and find an instance of Euler path). It can also be used the
other way round: given a path it computes a list that represents the Euler
graph with this path (or all such lists, but this would be overzealous).

We shall need a few more definitions to formulate Euler’s fundamental
theorem on Euler graphs. A graph is connected if for each two vertices
V1, V2 there is a path from V1 to V2. For example, the graph of Fig. 4.15
is connected. The degree of a vertex is the number of edges which contain
the vertex. For example, b in our graph is a vertex of degree 4, and e of
degree 3.

The theorem states that a graph is an Euler graph if and only if it 
is connected and contains either no vertices of an odd degree, or exactly 
two such vertices. In the latter case, the two odd-degree vertices are 
terminal vertices of each Euler path. In the former case, each Euler path 
is a cycle, i.e. a path that returns to the starting point. In our example, d 
and e are the only vertices of odd degree. 
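The degree condition of the theorem is easy to test mechanically. The sketch below is an illustration only: all procedure names are hypothetical, the operator declaration for <-> is assumed (priority 700), connectivity is not checked, and the if-then-else construct and sort/2 of current Prolog systems are used, which Toy may not provide in this form.

```prolog
:- op(700, xfx, <->).    % assumed declaration, as for the edge lists above

% vertices(Edges, Vs): sorted, duplicate-free list of vertices in Edges.
vertices(Edges, Vs) :-
    endpoints(Edges, Raw), sort(Raw, Vs).

endpoints([], []).
endpoints([A <-> B | Edges], [A, B | Rest]) :- endpoints(Edges, Rest).

% degree(V, Edges, D): D is the number of edges that contain V.
degree(_, [], 0).
degree(V, [A <-> B | Edges], D) :-
    degree(V, Edges, D0),
    (   ( V == A ; V == B ) -> D is D0 + 1 ; D = D0 ).

% odd_vertices(Edges, Odd): the vertices of odd degree.
odd_vertices(Edges, Odd) :-
    vertices(Edges, Vs), odd_only(Vs, Edges, Odd).

odd_only([], _, []).
odd_only([V | Vs], Edges, Odd) :-
    degree(V, Edges, D),
    (   D mod 2 =:= 1 -> Odd = [V | Odd1] ; Odd = Odd1 ),
    odd_only(Vs, Edges, Odd1).

% degree_condition(Edges): no odd-degree vertices, or exactly two.
degree_condition(Edges) :-
    odd_vertices(Edges, Odd), length(Odd, N), ( N =:= 0 ; N =:= 2 ).
```

For the edge list d <-> b, b <-> a, a <-> c, c <-> b, b <-> e, e <-> c, c <-> d, d <-> e, odd_vertices yields d and e, in agreement with the degrees quoted in the text.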

If the graph is known to be an Euler graph, an Euler path can be found 
in time proportional to the number of edges. Once removed, the edge can 
be attached to the path for good. You may find it amusing to modify the 
above program in this direction. 





5 SUMMARY OF SYNTAX 

AND BUILT-IN PROCEDURES 


This chapter describes Prolog as defined by Toy, the implementation
presented in Chapter 7. The supported dialect is very similar to Prolog-10
(Pereira et al. 1978, Bowen 1981, Clocksin and Mellish 1981); some, but
not all, differences are noted. Other “standard” versions will be similar:
use the appropriate reference manuals.

The user communicates with Toy through an interactive interface 
(see Sections 1.2.2 and 7.4.2). 


5.1. PROLOG SYNTAX 


A program can be regarded (roughly) as a sequence of clauses.
Definitions and grammar rules in the sequence are grouped in procedures.
There are quite a few principles that govern “consulting”/“reconsulting”, and
dynamically asserting/retracting clauses (with the redefinition switch on
or off). Therefore a formal definition of procedures would be unnecessarily
involved: it should account for the fact that procedures change in time.

The notation used is extended BNF. Non-terminal symbols will be
boldfaced, and some of them subscripted (this is the first extension). A
BNF rule takes the general form

lhsnonterm ::= rhs_1 | ... | rhs_n

(lhsnonterm is either rhs_1 or ... or rhs_n).

Each rhs is a sequence of non-terminals and terminals. The second
extension: zero or more occurrences of a sequence s are denoted as {s}.
To avoid confusion, terminal symbols | { } will be boldface. Throughout
the description, we assume standard operator declarations are in force.

Many special forms, such as integer expressions to be evaluated by is,
=:=, etc., or lists of single characters, will not be described. See Section
5.2 and on for applications of these forms in built-in procedures.

Some comments in plain English will be interspersed in the BNF
description. See also the notes at the end of this section.


clause ::= definition | grammarrule | directive
definition ::= nonunitclause | unitclause
nonunitclause ::= head :- body
unitclause ::= head
    COMMENT the main functor of head is not a
            binary :-
head ::= nonvarint
body ::= bodyalt { ; bodyalt }
bodyalt ::= call { , call }
call ::= nonvarint | variable | ( body )
nonvarint ::= term
    COMMENT not a variable or an integer (a formal
            definition would be straightforward but
            cumbersome)
grammarrule ::= lhside → rhside
    COMMENT the arrow is written as -->
lhside ::= nonterminal context | nonterminal
nonterminal ::= nonvarint
context ::= terminals
rhside ::= alternatives
alternatives ::= alternative { ; alternative }
alternative ::= ruleitem { , ruleitem }
ruleitem ::= nonterminal | terminals |
             condition | ! | ( alternatives )
terminals ::= list | string
    COMMENT only closed lists are allowed
condition ::= curlyterm
directive ::= command | query
command ::= :- body
query ::= body
    COMMENT body's main functor is not a unary :-
term ::= term_1200
term_N ::= op_fx,N term_N-1 | op_fy,N term_N |
           term_N-1 op_xf,N | term_N op_yf,N |
           term_N-1 op_xfx,N term_N-1 |
           term_N-1 op_xfy,N term_N |
           term_N op_yfx,N term_N-1 | term_N-1
    COMMENT 1 =< N =< 1200; op_Type,N is an operator of
            type Type and priority N; term_N can be
            called “term with priority N”
term_0 ::= variable | integer | string |
           list | noop |
           noop( term { , term } ) |
           ( term ) | curlyterm
curlyterm ::= { term }
noop ::= functor
op_T,N ::= functor
    COMMENT T is one of fx, fy, xf, yf, xfx, xfy,
            yfx; N is in the range 1..1200; see
            also note 1
list ::= [] | [ term_999 { , term_999 } ] |
         [ term_999 { , term_999 } | term ]
    COMMENT terms with priority 999 can be safely
            conjoined by commas which are infix
            functors with priority 1000
functor ::= word | qname |
            symbol | solochar
word ::= wordstart { alphanum }
wordstart ::= smalletter
alphanum ::= smalletter | bigletter |
             digit | _
qname ::= '{ qitem }'
qitem ::= '' | nonquote
    COMMENT nonquote is any character other than '
symbol ::= symch { symch }
variable ::= varstart { alphanum }
varstart ::= bigletter | _
integer ::= - digit { digit } | digit { digit }
string ::= "{ sitem }"
    COMMENT in Toy a string is equivalent to a list of
            character names; in Prolog-10, to a list
            of their ASCII codes
sitem ::= "" | nondquote
    COMMENT nondquote is any character other than "
smalletter ::= a | b | c | d | e | f | g | h | i |
               j | k | l | m | n | o | p | q | r |
               s | t | u | v | w | x | y | z
bigletter ::= A | B | C | D | E | F | G | H | I |
              J | K | L | M | N | O | P | Q | R |
              S | T | U | V | W | X | Y | Z
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
symch ::= . | : | - | < | = | > | + | / |
          * | ? | & | $ | @ | # | -’ | \
    COMMENT a lone dot followed by white space is not a symch
            but a fullstop
solochar ::= , | ; | !
token ::= functor | variable | integer |
          string | bracketbar
    COMMENT tokens are listed to explain note 6 below
bracketbar ::= ( | ) | [ | ] | { | } | |
comment ::= % { nonlineend } lineend
    COMMENT lineend is an end-of-line (linefeed)
            character; nonlineend is any other
            character. Toy converts line-ends to
            single linefeeds
whitespace ::= { layoutchar }
    COMMENT layoutchar is blank or tab or lineend
            or any nonprintable character (in
            ASCII these are characters with codes
            =< 31)
fullstop ::= . layoutchar

Notes: 

1. Mixed functors have not been described, but their inclusion is
straightforward:

term_N ::= op_[xfy,fx],N term_N-1

and 11 other combinations. In Toy, a mixed functor can only have one
binary and one unary type, both with the same priority.

2. There are numerous ambiguous combinations of contiguous operators.
This grammar does not account for them. See Section 7.4.3 (and
Appendix A.3) for a rather detailed description in Prolog.

3. Not all functors can be declared as operators. Quoted names are
always taken as “normal” functors.

4. In the definition of body, commas and semicolons need not have been
actually singled out, because they are regular infix functors. The
definition

body ::= nonvarint

would not, however, emphasize the most common structure of body.

5. The syntax of directives conforms to the convention adopted in Toy.
See Section 7.4.2 for details.

6. Comments and whitespace can be freely inserted before and after a
token, and cannot be inserted in the middle of a token. Remember that
a comment extends till end-of-line.

7. Whitespace must be inserted between an unsigned integer and a minus
which is to be treated as a functor. A minus immediately preceding a
sequence of digits is taken as a part of the integer.

8. If curly brackets are not available, the usual practice is to use
“decorated brackets”: %( and %). This requires some care in the treatment
of comments.

9. A term on input must be terminated with a full stop not embedded in a
quoted name, string or comment.


5.2. BUILT-IN PROCEDURES: 
GENERAL INFORMATION 


For the purposes of this chapter, built-in procedures fall into two 
groups. System procedures are implemented in the interpreter described 
in Section 7.3. Predefined procedures are written in Prolog; they belong to 
the user interface described in Section 7.4. Together, these two groups 
cover the basic set of Prolog-10 procedures. Differences and extensions 
are noted where appropriate but this is a description of Toy and is not 
intended as a replacement for the Prolog-10 manual. The procedures are 
roughly classified according to their purpose. 

A system procedure call may fail, succeed or raise an error. Failure or 
success is equivalent to a failure or success of a normal procedure call. 
The only difference is that success is usually accompanied by a side- 
effect, such as writing a character, setting a switch, etc. A failing system 
procedure does not usually cause any side-effects (input procedures are a 
notable exception). 

An error is raised when a system procedure detects an incorrect 
parameter (or parameters). If the description of a procedure mentions the 
form of expected parameters, parameters of unlisted forms will cause an 
error to be raised. There is no guarantee that the error will be raised 
before any actions are performed, though this is usually so. 

Raising an error consists in invoking procedure error/1, with its single
parameter instantiated to the offending system procedure call. In general,
error behaves as if its call were present in the program instead of the
erroneous system procedure call. An explicit call to error is also possible.
error is a Prolog procedure: the standard library contains a simple version
which outputs a message and fails. The user can augment this procedure
to his liking, possibly providing different clauses as “error handlers” for
different system procedures. Redefinition of error requires removing it
from the standard library (see Section 7.4.5); in the present version it is
protected together with the whole library. Some predefined procedures
invoke error, and so can the user’s programs.

error is not in Prolog-10. 

The following are conventions observed throughout this chapter.
(Additional conventions or explanations appear under some group headings.)

Whenever we say that a procedure “tries to unify” we mean that it 
fails or succeeds depending on the outcome. Success means that unifica¬ 
tion is performed. 

When we say that a procedure “tests” something, we mean that it 
fails or succeeds according to the result. 

Acceptable parameters are indicated by conventional names listed 
below: 

TERM—any term will do 
INTEGER—an integer 
VAR—a variable 

NONVARINT—a non-variable, non-integer term 
CALL—same as NONVARINT 
ATOM—a NONVARINT without arguments 
NAME—same as ATOM 

CHAR—a NAME consisting of a single character 
FILENAME—a NAME conforming to the implementation-dependent 
conventions for specifying files 
CALLIST—a list (possibly empty) of CALLs 
CHARLIST—a list (possibly empty) of CHARs 
DIGITLIST—a CHARLIST built of digit characters 

In descriptions, PAR1, PAR2 etc. stand for actual parameters in the
built-in procedure call.

Note that '123' is a name, and 123 an integer. 9 is the integer nine, and
'9' is the digit (character). The output procedures do not always
distinguish between the two (writeq does).

Toy introduces a number of predefined operators. Some of them are
used as infix or prefix procedure names. Table 5.1 is the list of predefined
operators:

TABLE 5.1
Predefined Operators

Name    Type    Priority
:-      xfx     1200
:-      fx      1200
-->     xfx     1200
;       xfy     1100
,       xfy     1000
not     fy       900
=       xfx      700
is      xfx      700
=:=     xfx      700
=\=     xfx      700
<       xfx      700
=<      xfx      700
>       xfx      700
>=      xfx      700
@<      xfx      700
@=<     xfx      700
@>      xfx      700
@>=     xfx      700
==      xfx      700
\==     xfx      700
=..     xfx      700
+       yfx      500
+       fx       500
-       yfx      500
-       fx       500
*       yfx      400
/       yfx      400
mod     xfx      300
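The effect of types and priorities can be checked by comparing a term written with operators against the same term in canonical functional notation. The sketch below uses only operators whose declarations in Table 5.1 agree with those of current standard Prolog systems; demo is a hypothetical name.

```prolog
% Each comparison succeeds: the operator form on the left is the same
% term as the canonical form on the right.
demo :-
    1 + 2 * 3 == +(1, *(2, 3)),          % * (400) binds tighter than + (500)
    a - b - c == -(-(a, b), c),          % yfx: groups to the left
    (a, b, c) == ','(a, ','(b, c)),      % xfy: groups to the right
    (p :- q, r ; s) == ':-'(p, ';'(','(q, r), s)).

% ?- demo.
% succeeds
```

A lower priority number means tighter binding, which is why the comma (1000) groups q and r before the semicolon (1100) is applied, and both end up inside the body of :- (1200).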


5.3. CONVENIENCE 

true
    always succeeds.
fail
    always fails.
not CALL
    the “not” procedure (but see Section 4.3.2!): succeeds only when
    the parameter fails. Defined in Prolog:

        not C :- C, !, fail.
        not _.

CALL, CALL
    the “and” procedure: succeeds only when both arguments succeed.
    Defined in Prolog:

        A, B :- A, B.

    See also the description of the cut.
CALL ; CALL
    the “or” procedure: succeeds only if either of the parameters
    succeeds. Defined in Prolog:

        A ; _ :- A.
        _ ; B :- B.

    See also the description of the cut.
check(CALL)
    succeeds only when the parameter succeeds, but instantiates no
    variables; only side-effects of CALL remain. Defined in Prolog:

        check( Call ) :- not not Call.

    Not in Prolog-10.
side_effects(CALL)
    exactly equivalent to check(CALL), but used when the parameter is to
    be executed for its side-effects rather than to test something. Not in
    Prolog-10.
once(CALL)
    executes CALL deterministically. Defined in Prolog:

        once( Call ) :- Call, !.

    Not in Prolog-10.
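The Prolog definitions quoted above can be tried in any current system under different names. The names my_not, my_check and my_once below are hypothetical; they are used because not/1 and once/1 are usually already built in and protected.

```prolog
% Negation as failure: succeed only if C has no solution.
my_not(C) :- C, !, fail.
my_not(_).

% Double negation: succeeds iff Call succeeds, but leaves no bindings.
my_check(Call) :- my_not(my_not(Call)).

% Commit to the first solution of Call.
my_once(Call) :- Call, !.

% ?- my_check(X = 1), var(X).      succeeds: X is still unbound
% ?- my_once(member(X, [a, b])).   X = a, no further answers
```

The bindings made while proving Call inside my_check are undone when the inner my_not fails, which is why only side-effects of the call remain.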


5.4. ARITHMETIC 


In the descriptions, div stands for integer division, and mod for taking 
the remainder of integer division. 

The following are correct invocation patterns for sum/3 (not in
Prolog-10).

sum(INTEGER, INTEGER, INTEGER)

succeeds only if PAR1 + PAR2 = PAR3

sum(INTEGER, INTEGER, VAR)

succeeds after unifying PAR3 with the value of PAR1 + PAR2

sum(INTEGER, VAR, INTEGER)

succeeds after unifying PAR2 with the value of PAR3 - PAR1

sum(VAR, INTEGER, INTEGER)

succeeds after unifying PAR1 with the value of PAR3 - PAR2
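In a system that lacks sum/3, the four invocation patterns can be approximated with is/2. The sketch below is an illustration only: toy_sum/3 is a hypothetical name, and any call that fits none of the four patterns simply fails here (how Toy itself treats such calls is not modelled).

```prolog
% toy_sum/3: a hypothetical emulation of the four patterns of sum/3.
% Each clause tests the pattern, commits with a cut, then computes.
toy_sum(A, B, C) :- integer(A), integer(B), integer(C), !, C =:= A + B.
toy_sum(A, B, C) :- integer(A), integer(B), var(C),     !, C is A + B.
toy_sum(A, B, C) :- integer(A), var(B),     integer(C), !, B is C - A.
toy_sum(A, B, C) :- var(A),     integer(B), integer(C), !, A is C - B.
```

For instance, toy_sum( 2, X, 5 ) binds X to 3, and toy_sum( 2, 3, 6 ) fails.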

The following are correct invocation patterns for prod/4 (not in
Prolog-10).

prod(INTEGER, INTEGER, INTEGER, INTEGER)

succeeds only if PAR1 * PAR2 + PAR3 = PAR4

prod(INTEGER, INTEGER, INTEGER, VAR)

succeeds after unifying PAR4 with the value of PAR1 * PAR2 + PAR3

prod(INTEGER, INTEGER, VAR, INTEGER)

succeeds after unifying PAR3 with the value of PAR4 - PAR1 * PAR2

prod(INTEGER, VAR, VAR, INTEGER)

succeeds after unifying PAR2 with the value of PAR4 div PAR1 and
PAR3 with the value of PAR4 mod PAR1

prod(VAR, INTEGER, VAR, INTEGER)

like the previous one, but with PAR1 and PAR2 exchanged

prod(INTEGER, VAR, INTEGER, INTEGER)

fails if (PAR4 - PAR3) mod PAR1 is not zero; otherwise succeeds
after unifying PAR2 with the value of (PAR4 - PAR3) div PAR1

prod(VAR, INTEGER, INTEGER, INTEGER)

like the previous one, but with PAR1 and PAR2 exchanged
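The seven patterns of prod/4 can be sketched in the same manner. Again this is an assumption-laden illustration rather than Toy's own code: toy_prod/4 is a hypothetical name, the host system is assumed to provide the evaluable functors div and mod, and the operands are assumed to be non-negative with non-zero divisors (the behaviour for negative operands may differ between systems).

```prolog
% toy_prod/4: a hypothetical emulation of prod/4, i.e. of the relation
% PAR1 * PAR2 + PAR3 = PAR4. Assumes non-negative operands.
toy_prod(A, B, C, D) :- integer(A), integer(B), integer(C), integer(D), !,
    D =:= A * B + C.
toy_prod(A, B, C, D) :- integer(A), integer(B), integer(C), var(D), !,
    D is A * B + C.
toy_prod(A, B, C, D) :- integer(A), integer(B), var(C), integer(D), !,
    C is D - A * B.
toy_prod(A, B, C, D) :- integer(A), var(B), var(C), integer(D), !,
    B is D div A, C is D mod A.          % quotient and remainder
toy_prod(A, B, C, D) :- var(A), integer(B), var(C), integer(D), !,
    A is D div B, C is D mod B.
toy_prod(A, B, C, D) :- integer(A), var(B), integer(C), integer(D), !,
    (D - C) mod A =:= 0, B is (D - C) div A.   % exact division only
toy_prod(A, B, C, D) :- var(A), integer(B), integer(C), integer(D), !,
    (D - C) mod B =:= 0, A is (D - C) div B.
```

For instance, toy_prod( 3, Q, R, 14 ) binds Q to 4 and R to 2, whereas toy_prod( 3, X, 1, 14 ) fails because 13 is not divisible by 3.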

TERM is TERM

the is procedure assumes PAR2 is an integer expression, i.e. a term
composed of integers by means of standard arithmetic functors: +
(binary and unary), - (binary and unary), *, /, mod. The procedure
fails if PAR2 is not an integer expression. Otherwise it evaluates the
expression and tries to unify the value with PAR1. According to
Prolog-10 conventions, the is procedure can also evaluate a list

[INTEGER]

as this INTEGER; e.g. 55 is [55] succeeds. (This is needed in
Prolog-10 mainly for evaluating single character strings to ASCII
codes.) Defined in Prolog.


5.5. COMPARING INTEGERS AND NAMES 

less(INTEGER, INTEGER)

succeeds only if PAR1 < PAR2. Not in Prolog-10.

TERM =:= TERM

PAR1 and PAR2 are treated as integer expressions and evaluated.
The procedure succeeds only if both parameters are proper integer
expressions (see is/2) and their values are equal. Defined in Prolog.

TERM =\= TERM

as above, but tests whether the values are not equal

TERM < TERM

as above, but tests whether the value of PAR1 is less than that of
PAR2

TERM =< TERM

as above, but tests whether the value of PAR1 is not greater than that
of PAR2

TERM > TERM

as above, but tests whether the value of PAR1 is greater than that of
PAR2

TERM >= TERM

as above, but tests whether the value of PAR1 is not less than that of
PAR2

NAME @< NAME

succeeds only when PAR1 precedes PAR2 in the lexicographic order
(as defined by the underlying ASCII collating sequence).

NAME @=< NAME

like @<, but tests whether PAR2 does not precede PAR1. Defined in
Prolog.

NAME @> NAME

like @<, but tests whether PAR2 precedes PAR1. Defined in Prolog.

NAME @>= NAME

like @<, but tests whether PAR1 does not precede PAR2. Defined in
Prolog.


5.6. TESTING TERM EQUALITY 

TERM = TERM 

tries to unify PAR1 and PAR2. Defined in Prolog:

X = X.

eqvar(VAR, VAR)

succeeds only when the parameters are two occurrences of the same
nondummy variable. Not in Prolog-10.

TERM == TERM

succeeds only when the parameters are two occurrences of the same
term. For example, if A, B are uninstantiated,

p( A ) == p( B )

fails, even though

p( A ) = p( B )

succeeds. Defined in Prolog.

TERM \== TERM

succeeds only when the parameters are not two occurrences of the
same term. Defined in Prolog.
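The difference between identity and unifiability is easy to check mechanically. The sketch below uses only =, == and \== as found in Prolog-10-style systems; identity_demo/0 is a hypothetical name, and \+ is written in place of not.

```prolog
% identity_demo/0 (hypothetical name) succeeds if ==, \== and = behave
% as described in the text.
identity_demo :-
    \+ p(A) == p(B),     % distinct uninstantiated variables: not identical
    p(A) \== p(B),
    p(A) = p(B),         % unification succeeds and binds A and B together
    p(A) == p(B).        % after unification both sides are the same term
```

Note that the order of the calls matters: once A and B have been unified, the two terms are identical and the first two tests would no longer succeed.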


5.7. INPUT/OUTPUT

5.7.1. Switching Streams 

This set of procedures can be used to dynamically change the files 
read or written by the input/output procedures. The user’s terminal is 
treated like any other file: its name is user (both for input and output); the 
terminal is read from and written on by default. 

Ideally, one should be able to open a file with tell or see, stop using it 
with another tell or see, start using it from the current position after a 
second tell or see, and close it with told or seen. There should be no limits 
on the interleaving introduced by using a file in the middle of using a file in 
the middle etc. 

The procedures are described as if this situation were real. In
practice, things are very implementation-dependent. The version of Toy
presented in Chapter 7 has only two input and two output streams: one for the
terminal and one for a disk file in each direction. Also, Toy has no code for
dealing with incorrect file names, nonexistent files and the like. All this is too
dependent on the environment in which it is implemented.

see(FILENAME)

the specified file becomes the current input file; the terminal’s name
is user

seeing(TERM)

tries to unify the parameter with the name of the current input file

seen

closes the current input file; user becomes current. Has no effect if
the current file is user

tell(FILENAME)

the specified file becomes the current output file; the terminal’s name
is user

telling(TERM)

tries to unify the parameter with the name of the current output file

told

closes the current output file; user becomes current. Has no effect if
the current file is user

5.7.2. Listing Control 

The Toy-Prolog interpreter contains a listing switch. If the switch is 
on, each line read in from the current input is listed on the user’s terminal: 
this is useful when one wants to see what is being read from a disk file. 

echo

succeeds after turning the listing switch on; has no effect if the switch
is already on. Not in Prolog-10.

noecho

succeeds after turning the listing switch off; has no effect if the
switch is already off. Not in Prolog-10.

5.7.3. Terms

display(TERM)

writes the term onto the current output. The term is written in
standard notation (prefix with parentheses) and identifiers are not quoted
even if they normally should be. Variables are written as _n, where
n is an address. There is no guarantee that a variable will be printed
as the same address in different invocations of display. In Prolog-10,
display is a little different: it always writes on the user’s terminal.

write(TERM)

writes the term onto the current output. The term is written
according to operator declarations currently in force. No identifiers are
quoted. Variables are written as X1, X2 etc. Each invocation of write
begins numbering from 1, so that e.g. the calls

write( X ), write( f( Y, X ) )

will produce

X1f( X1, X2 )

Defined in Prolog. CAUTION: in Toy, write uses numbervars (see
Section 5.15) which binds variables in the term to ’V’(N) for N = 1,
2, etc. Hence, write cannot output any term ’V’(INTEGER)
properly.

writeq(TERM)

same as write, but quotes identifiers that are not proper words or
symbols, and also those identifiers that coincide with operator
names; e.g. a 3-parameter is would be quoted. However, a quote
within a quoted name will not be doubled (this is a bug, actually).
Otherwise, a term written by writeq can be read back by read.

read(TERM)

reads from the current input a term, terminated with a full stop.
Succeeds only when PAR1 unifies with this term. Operator
declarations currently in force are taken into account. Recall that a quoted
name cannot be an operator. If the text on input is not a correct term,
read prints the message

+++ Bad term on input. Text skipped:

skips and reprints the input until the first (still unprocessed) full stop,
and tries to unify PAR1 with ’e r r’. (If the erroneous line does not
contain a full stop, you should input one before Prolog resumes.) See
the next section for behaviour on detecting the end of a file. Defined in
Prolog in terms of single-character input (see the next section).

op(INTEGER, TERM, ATOM)

declares an operator with PAR3 as the name, PAR1 as the priority
(1 =< PAR1 =< 1200), and PAR2 as the type. PAR1 is usually less
than 1000, to avoid conflicts with clause-constructing operators (see
the table in Section 5.2); operators with lower priority take
precedence over those with a higher priority. PAR2 must be a proper word
or symbol. Admissible types of operators are fx, fy (unary, prefix);
xf, yf (unary, postfix); xfx, xfy, yfx (binary, infix). The types fx, xf,
xfx are non-associative; fy, yf, associative; xfy, right-associative;
yfx, left-associative. Any other PAR2 causes an error.
If an operator declaration with this name but another priority is
already in force, the procedure replaces the old declaration with the
new one. If a declaration with the same name and priority exists,
three possibilities arise:

- both operators are binary or both unary; the old definition is
replaced;

- the old operator is unary (binary), the new one binary (unary); a
mixed functor is declared;

- the old operator is mixed, the new one binary (unary); the binary
(unary) type in the mixed functor declaration is replaced with
PAR2.

Defined in Prolog.

delop(ATOM)

the operator declaration with the name given by PAR1 is deleted. The
name should be quoted to prevent it from being treated as an
(erroneous) operator with missing arguments. Defined in Prolog. Not in
Prolog-10.
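The effect of operator declarations on the reading of terms can be observed with =.. (see Section 5.10). The following sketch assumes a Prolog-10-style op/3 with the argument order described above; the operator names likes and very, and the procedure op_demo/2, are invented for the illustration.

```prolog
% Two invented operators (assumptions of this sketch).
:- op(700, xfx, likes).   % non-associative infix operator
:- op(200, fy, very).     % associative prefix operator, binds tighter

% op_demo/2 takes apart a term written in operator notation.
op_demo(Name, Args) :-
    Term = (john likes very very nice_food),
    Term =.. [Name | Args].
```

The call op_demo( N, A ) should instantiate N to likes and A to [john, very(very(nice_food))]: the prefix operator, having the lower priority, is applied first, and since its type is fy it may be applied to its own result.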


5.7.4. Single Characters 

The Toy interpreter contains a single-character input buffer called the 
current character. Initially, it contains a blank and is then refilled by each 
reading operation. In the presented version, each line end is treated as if it 
were a linefeed character (ordinal number 10, see the procedure iseoln). 
Behaviour upon detection of end-of-file depends on the current input. If 
the input is user (i.e. the terminal), Prolog is terminated; otherwise an 
automatic seen is performed and the reading operation is restarted. 

The operations presented here (except nl) differ from those in
Prolog-10. In Toy, the arguments of input/output operations are characters, and
the internal buffer can be used to rescan the current input character. In
Prolog-10 there is no such buffer and the arguments of the operations are
integers, i.e. character codes. These operations could be defined as
follows:

get0( Ord ) :- rch, lastch( Ch ), ordchr( Ord, Ch ).
get( Ord ) :- rch, skipbl, lastch( Ch ),
              ordchr( Ord, Ch ).
skip( X ) :- repeat, get0( X ), !.
put( Ord ) :- ordchr( Ord, Ch ), wch( Ch ).

This would not assure complete compatibility, however. Erroneous calls 
would be handled a little differently and so would line ends. See also the 
description of strings. 

rch

succeeds after filling current character with the next character from
current input (but see the introductory remarks for effects of line end
or end-of-file)

skipbl

succeeds after ensuring that current character is a printing character
with ordinal number greater than 32. Does nothing if it already is such
a character; otherwise repeatedly invokes rch.

lastch(TERM)

tries to unify its parameter with current character

wch(CHAR)

writes the character on current output (the linefeed character is
interpreted as line terminator)

nl

terminates the current output line. Defined in Prolog:

:- ordchr( 10, Ch ), assert( ( nl :- wch( Ch ) ) ).

rdch(TERM)

gets the next character from current input (by invoking rch). Makes a
copy of current character, treating a non-printing character (including
line end) as a blank; tries to unify the copy with its parameter.
Defined in Prolog.

rdchsk(TERM)

same as above, but preceded by a call on skipbl


5.7.5. Others 

These procedures are not really concerned with input/output, but the
only effect of status is to write something, and ordchr is most useful when
reading or writing non-printing characters. None of them is in Prolog-10.

ordchr(INTEGER, CHAR)

succeeds only when PAR1 is the ordinal number (ASCII code) of PAR2

ordchr(VAR, CHAR)

succeeds after unifying the variable with the ordinal number of the
character

ordchr(INTEGER, VAR)

succeeds after unifying the variable with the character whose ordinal
number is the value of PAR1 mod 128

iseoln(TERM)

tries to unify PAR1 with the end-of-line character. Defined in Prolog:

:- ordchr( 10, Ch ), assert( iseoln( Ch ) ).

status

writes memory utilisation information on the current output

See also consult/1, reconsult/1, listing/0 and listing/1 in Section 5.11.


5.8. TESTING CHARACTERS


Each of these procedures fails or succeeds depending on whether its
parameter is a character belonging to a particular class. They are designed
to help the user interface (see Section 7.4) in reading Prolog terms, but
some of them are of general utility. The procedures iseoln/1 and ordchr/2
(see Section 5.7.5) are also used to test characters.

smalletter(TERM)

tests whether the parameter is a lower case letter

bigletter(TERM)

tests whether the parameter is an upper case letter

letter(TERM)

tests whether the parameter is an upper or lower case letter

digit(TERM)

tests whether the parameter is a decimal digit

alphanum(TERM)

tests whether the parameter is a letter, a digit or an underscore
character

bracket(TERM)

tests whether the parameter is one of the following characters:

()[]{} 

solochar(TERM) 

tests whether the parameter is one of the following characters: 

t 

* i t 

symch(TERM) 

tests whether the parameter is one of the following characters: 

H-*/ = (©#$&:. ?<>-i\ 


5.9. TESTING TYPES 


These procedures fail or succeed depending on the form of their
arguments.

var(TERM)

tests whether the parameter is an uninstantiated variable

integer(TERM)

tests whether the parameter is an integer

nonvarint(TERM)

tests whether the parameter is a NONVARINT (neither a variable
nor an integer); not in Prolog-10

atom(TERM)

tests whether the parameter is an atom (a NONVARINT without
arguments)


5.10. ACCESSING THE STRUCTURE OF TERMS 


The procedures pname and pnamei are not in Prolog-10. They replace
name/2, which is similar, but which uses lists of integers (ASCII codes) in
place of our lists of characters (see Section 5.7.4).

pname(NAME, TERM)

builds a list of characters forming the name and tries to unify it with
PAR2

pname(VAR, CHARLIST)

succeeds after unifying the variable with a NAME formed of the
characters on the list. (Note that pname(X, [1, 2, 3]) binds X to the
name ’123’, and not to the integer 123).

pnamei(INTEGER, TERM)

builds a list of decimal digit characters (constituting the written form
of the integer) and tries to unify it with the term; the integer must not
be negative.

pnamei(VAR, DIGITLIST)

succeeds after unifying the variable with an integer whose written
form is given by the digit characters on the list. Even when the
parameters are formally correct, an error may be raised if the
specified integer is too large.

functor(VAR, INTEGER, 0)

PAR3 is the integer zero; succeeds after unifying the variable with
PAR2 (this version is allowed for completeness, see below for
sensible uses of functor)

functor(VAR, NAME, INTEGER)

succeeds after unifying the variable with a term whose main functor
has the name and arity defined by PAR2 and PAR3, and whose
arguments are different variables; PAR3 must not be negative

functor(INTEGER, TERM, TERM)

tries to unify PAR2 with PAR1 and PAR3 with the integer zero

functor(NONVARINT, TERM, TERM)

tries to unify PAR2 and PAR3 with the name and arity of the main
functor in PAR1

arg(INTEGER, NONVARINT, TERM)

fails if the integer is smaller than 1 or greater than the arity of the
main functor in PAR2. Otherwise tries to unify PAR3 with that
argument of PAR2 whose number is given by PAR1.
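The invocation patterns of functor and arg can be exercised together. The sketch below relies only on functor/3 and arg/3 as they are found in Prolog-10-style systems; functor_demo/0 and the functor point are invented for the illustration.

```prolog
% functor_demo/0 (hypothetical name) succeeds if functor/3 and arg/3
% behave as described in the text.
functor_demo :-
    functor(T, point, 2),          % builds point(_, _) with fresh variables
    arg(1, T, 3), arg(2, T, 4),    % fills in the two arguments
    T == point(3, 4),
    functor(T, Name, Arity),       % takes the term apart again
    Name == point, Arity == 2,
    functor(42, N0, A0),           % an integer: its "name" is itself
    N0 == 42, A0 == 0,
    functor(I, 42, 0), I == 42,    % the version allowed for completeness
    \+ arg(3, T, _).               % argument number out of range: fails
```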

The following are correct invocation patterns for the procedure =..
(pronounced “univ”), which is defined in Prolog.

VAR =.. [INTEGER]

succeeds after unifying the variable with the integer

VAR =.. [NAME | TERM]

if the term is not a closed list, an error in the procedure length/2 is
raised (=.. uses length). Otherwise a term with NAME as its name
and TERM as its argument list is created and unified with VAR.

INTEGER =.. TERM

tries to unify the term with [PAR1]

NONVARINT =.. TERM

constructs a list, with PAR1’s main functor as the head and the list of
PAR1’s arguments as the tail. Tries to unify the list with PAR2.
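The four patterns of =.. can be checked in a few lines. This is a sketch for a Prolog-10-style system; univ_demo/0 and the functor foo are invented names. (How a list that is not closed is reported differs between systems and is not shown.)

```prolog
% univ_demo/0 (hypothetical name) exercises the four patterns of =.. .
univ_demo :-
    T =.. [foo, a, B],   T == foo(a, B),      % VAR =.. [NAME | TERM]
    foo(x, y) =.. L,     L == [foo, x, y],    % NONVARINT =.. TERM
    I =.. [7],           I == 7,              % VAR =.. [INTEGER]
    7 =.. L7,            L7 == [7].           % INTEGER =.. TERM
```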

5.11. ACCESSING PROCEDURES 


The Toy-Prolog interpreter supports assert/3, retract/3 and clause/5.
These are low-level, but quite powerful procedures (see the editor in
Appendix A.4). Parameters representing clause bodies have the form of
lists of calls (an empty body is []).

The Prolog library uses these low-level routines to define Prolog-10
procedures assert/1, asserta/1, assertz/1, retract/1 and clause/2. Unlike
most other built-in procedures, retract/1 and clause/2 are
non-deterministic. Parameters representing clause bodies have the form of terms used in
the external representation, i.e. sequences built with commas (an empty
body is true).

An attempt to apply any of these procedures to system routines
defined by the interpreter is treated as an erroneous call. So is an attempt to
modify protected procedures. (There is a diagnostic printout which
cannot be suppressed by redefining error/1.)

Caution: Remember the standard operator declarations listed in
Section 5.2. To be safe, always enclose a clause-representing term in
parentheses. For example,

assert( ( a :- b, c ) )

is okay, but

assert( a :- b, c )

is a call on assert/2. In some versions of Prolog, even

assert( a :- b )

is incorrect.

assert(NONVARINT, CALLIST, INTEGER)

PAR1 is treated as a clause’s head, PAR2 as its body. The clause is
asserted immediately after the n-th clause of this procedure (where n
is PAR3 if a clause with this number exists, and the last clause’s
position if PAR3 is too large; if the procedure is empty or PAR3 < 1,
the clause is asserted as first). Not in Prolog-10.

retract(NAME, INTEGER, INTEGER)

PAR2 must not be negative. PAR1 and PAR2 define the name and
arity of a predicate symbol. If the associated procedure does not
contain a clause whose number is given by PAR3 (the first clause has
number 1), retract fails. If the clause does exist, it is logically
removed from the procedure and retract succeeds. A removed clause
does not disappear from storage and its active instances can still run
to completion. Not in Prolog-10.

clause(NAME, INTEGER, INTEGER, TERM, TERM)

PAR2 must not be negative. PAR1 and PAR2 are treated as the name
and arity of a predicate symbol. If the associated procedure has no
clause whose number is given by PAR3 (in particular, if it is a system
routine) then clause fails. Otherwise it tries to unify PAR4 with the
head of the clause and PAR5 with its body. Not in Prolog-10.

asserta(NONVARINT)

treats the parameter as a clause (non-unit if its main functor is :-/2,
unit otherwise). An error is raised if the first argument of a :- is not a
NONVARINT. Asserts the clause at the beginning of its procedure,
creating the procedure if it does not exist. Defined in Prolog.

assertz(NONVARINT)

same as above, but the assertion is at the end of the procedure.

assert(NONVARINT)

equivalent to assertz(PAR1).

retract(NONVARINT)

the parameter is treated as a clause (non-unit if its main functor is :-/2
and unit otherwise). An error is raised if the first argument of :- is not
a NONVARINT. The first matching clause is retracted, and a fail
point created (see Section 5.12). On failure, the next matching clause
will be retracted. Note: if PAR1 has the form nonvarint :- var then it
matches only clauses with a single call in their bodies. Defined in
Prolog.

clause(NONVARINT, TERM)

tries to locate the first clause whose head matches PAR1 and whose
body matches PAR2; the body of a unit clause is the term true. After
successful unification, establishes a fail point and succeeds; the next
matching clause is sought on failure. Defined in Prolog. Not in
Prolog-10.

redefine

this procedure is needed to implement reconsult and should not be
used directly. It modifies the effects of assert: if the procedure to
which a clause is added is different from that affected by the last
assertion, an automatic abolish is invoked before the assert. The next
invocation of redefine restores the original situation.

protect

succeeds after ensuring that all procedures already defined, except
those whose heads are single characters with no arguments (this
restriction is imposed by a minor technical difficulty), are protected.
An attempt to modify a protected procedure (by means of assert,
retract, abolish, consult, reconsult) is treated as an erroneous
invocation of the system procedure in question. (The user interface in
Toy protects all its procedures.) Not in Prolog-10.

abolish(NAME, INTEGER)

PAR2 must not be negative. PAR1 and PAR2 are treated as the name
and arity of a predicate symbol. All the clauses of this procedure are
logically removed (retracted) and abolish succeeds.

predefined(NAME, INTEGER)

PAR2 must not be negative. PAR1 and PAR2 are treated as the name
and arity of a predicate symbol. If the procedure associated with this
symbol is a system procedure, predefined succeeds; otherwise it
fails. Not in Prolog-10.

consult(FILENAME)

sees the named file and enters program definition mode: successive
terms are read in and asserted via assertz (see the convention for
asserta’s parameters; but see also protect). There are two
exceptions: the term end causes the file to be closed and definition
mode to be exited; terms with the unary :- as a main functor are
treated as commands, and immediately executed. Defined in Prolog.

reconsult(FILENAME)

as above, but redefine is called at the beginning and at the end of
processing. Contiguous sequences of clauses with the same predicate
symbol in their heads are treated as complete definitions of
procedures and supersede previous definitions.

listing(NONVARINT)

PAR1 must be an ATOM, or a term of the form ATOM/INTEGER or
a list of such terms (possibly multi-level). Each atom is treated as a
procedure’s name, each integer as a procedure’s arity. All relevant
procedures are listed on the current output. Defined in Prolog.

listing

as above, but for all defined procedures (including procedures
defined in the monitor and library, but excluding built-in system
procedures).
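The one-parameter procedures asserta, assertz and retract, and the two-parameter clause, can be tried in most Prolog-10-style systems. The sketch below is an illustration under stated assumptions: shade/1 and db_demo/2 are invented names, and the dynamic declaration is something many current systems require before clause may inspect a procedure (Toy has no such declaration).

```prolog
% shade/1 is an invented procedure that the demo builds and modifies.
:- dynamic shade/1.

% db_demo(-Bodies, -Remaining): Bodies lists the clause bodies found by
% clause/2; Remaining lists the solutions left after one retraction.
db_demo(Bodies, Remaining) :-
    retractall(shade(_)),                % start from an empty procedure
    assertz(shade(green)),
    asserta(shade(red)),                 % placed before green
    assertz((shade(X) :- X = blue)),     % note the extra parentheses
    findall(B, clause(shade(_), B), Bodies),
    retract(shade(red)),
    findall(C, shade(C), Remaining).
```

Here Bodies should come out as a three-element list whose first two members are true (the bodies of the unit clauses) and whose third has the form _ = blue; Remaining should be [green, blue].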


5.12. CONTROL 


Whenever a procedure call activates a clause which is not the last 
clause in its procedure, we say that a fail point is associated with the call. 
A fail point is something to backtrack to: it saves information necessary 
for reestablishing the state of the computation and proceeding with the 
next clause. 

The immediate descendants of a call C are the calls in the procedure
which C activated. The immediate ancestor of a call C is the call which
activated the procedure containing C. An ancestor is the immediate
ancestor or an ancestor of the immediate ancestor. A descendant is defined
similarly.

!

the cut procedure; succeeds after finding the nearest ancestor which
is not a call/1, tag/1, ,/2 or ;/2 and removing all existing fail points
associated with this ancestor and all its descendants.

repeat

an endless “generator of successes” (see Section 4.3.2). Defined in
Prolog:

repeat.
repeat :- repeat.

call(CALL)

behaves exactly as if its parameter were in its place, with the
exception that an incorrect parameter (an integer or uninstantiated
variable) is detected at run time rather than at clause-definition time. In
top-level syntax, one can use a variable instead of a predicate; this
is converted to an invocation of call.

halt(ATOM)

stops the interpreter after writing the atom. Not in Prolog-10.

stop

stops the interpreter. Not in Prolog-10. Defined in Prolog.

The following procedures are not in Prolog-10. They are useful for error
handling, but are “dirty”, and should be used sparingly.

tag(CALL)

this is a form of call/1 which can be referred to by tagfail/1, tagexit/1,
tagcut/1 and ancestor/1. The parameter of tag is called a “tagged
ancestor” of its descendants; it is never removed from the stack as a
result of tail recursion optimisation (see Sections 6.4 and 7.1).
NOTE: a tag is recognized only when explicitly written in its clause.
In particular call(tag(C)) is equivalent to call(call(C)).

ancestor(TERM)

searches for the nearest tagged ancestor unifiable with the parameter;
fails if no such ancestor is found, otherwise unifies and succeeds.

tagcut(TERM)

searches for the nearest tagged ancestor unifiable with the parameter.
Fails if no such ancestor is found; otherwise unifies, removes all
existing fail points associated with the ancestor and its descendants
and succeeds.

tagfail(TERM)

equivalent to

tagcut( PAR1 ), fail

i.e. if the appropriate tagged ancestor is found, the ancestor fails
immediately; otherwise tagcut fails.

tagexit(TERM)

searches for the nearest tagged ancestor unifiable with the parameter;
fails if no such ancestor is found, otherwise unifies and passes control
to the ancestor, which succeeds immediately.


5.13. DEBUGGING 

The built-in debugging facilities of Toy are very primitive. There is
only a wall-paper trace which displays all calls with a plus or a minus to
indicate success or failure respectively (e.g. if a call fails to match two
clauses and activates the third, it is shown twice with a minus and once
with a plus).

A more useful, selective tracer is listed in Appendix A.5.

There is also a switch which may cause the interpreter to output
warning messages upon encountering calls on non-existent procedures. It
is good practice to turn it on when debugging a program.

None of these procedures is in Prolog-10, which has a more
sophisticated set of debugging aids.

debug

succeeds after turning tracing on (no effect if already on)

nodebug

succeeds after turning tracing off (no effect if already off)

nonexistent

succeeds after turning on warning about calls on nonexistent
procedures (no effect if already on)

nononexistent

succeeds after turning off warning about calls on nonexistent
procedures (no effect if already off)


5.14. GRAMMAR PROCESSING 


phrase(CALL, TERM)

treats CALL as a nonterminal symbol of a grammar rule,
schematically

nt( ARG1, ..., ARGn ),

and initiates grammar processing, with this initial symbol, by
calling

nt( TERM, [], ARG1, ..., ARGn )

Defined in Prolog (see Section 4.2.6).
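A small grammar shows phrase at work. The sketch below is written for a system with the usual --> notation and phrase/2; the nonterminals greeting and addressee are invented. One difference should be kept in mind: many current systems append the two list parameters after the nonterminal's own arguments, whereas Toy, as described above, places them first. The sketch therefore illustrates the idea, not Toy's exact translation.

```prolog
% An invented two-rule grammar: a greeting is the word hello followed
% by an addressee, whose identity is returned in the argument.
greeting(Who) --> [hello], addressee(Who).

addressee(world)  --> [world].
addressee(prolog) --> [prolog].
```

The call phrase( greeting( W ), [hello, world] ) should bind W to world, and phrase( greeting( _ ), [hello] ) should fail, since no addressee follows.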

5.15. MISCELLANEOUS

length(NONVARINT, TERM)

PAR1 must be a closed list. Computes the length of this list and tries
to unify the resulting integer with PAR2. Defined in Prolog.

isclosedlist(TERM)

succeeds only when the term is a closed list. Defined in Prolog. Not
in Prolog-10.

numbervars(TERM, INTEGER, TERM)

instantiates PAR1’s variables as ’V’(i), ’V’(i+1), ..., ’V’(j), where
i, i+1, ..., j are consecutive integers and i is the value of PAR2.
Variables bound together are of course instantiated as the same ’V’(k).
As a result, PAR1 becomes ground (obviously, this is undone upon
backtracking). Then the procedure tries to unify PAR3 with j+1. For
example, the call

numbervars( [ X, Y, X ], 6, Next )

where Next is uninstantiated, will instantiate

X ← ’V’(6), Y ← ’V’(7), Next ← 8

The call

numbervars( [ X, Y, X ], 6, not_a_number )

will fail.

member(TERM, TERM)

establishes the relationship: PAR1 is a member of the list PAR2.
Defined in Prolog:

member( X, [ X | Y ] ).
member( X, [ _ | Y ] ) :- member( X, Y ).

bagof(TERM, CALL, TERM)

tries to unify PAR3 with the list of PAR1’s instantiations after all
possible computations of PAR2 (see Section 4.2.4 for details).
Prolog-10 has a more sophisticated version of this procedure. Defined in
Prolog.
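The numbervars example can be reproduced in other systems, with one caveat: many current Prologs instantiate the variables to terms of the form '$VAR'(N) rather than Toy's ’V’(N). The sketch below assumes such a system; numbervars_demo/3 is a hypothetical name.

```prolog
% numbervars_demo/3 (hypothetical name) repeats the example from the
% text: X and Y are numbered from 6, Next receives the next free number.
numbervars_demo(X, Y, Next) :-
    numbervars([X, Y, X], 6, Next).
```

In a system following the '$VAR' convention, the call numbervars_demo( X, Y, Next ) instantiates X to '$VAR'(6), Y to '$VAR'(7) and Next to 8; both occurrences of X receive the same number.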






6 PRINCIPLES OF PROLOG 
IMPLEMENTATION 


6.1. INTRODUCTION 


This chapter is but a bird’s-eye view of implementation techniques
specific to Prolog. We assume you know how conventional block
structure languages are implemented: a competent programmer could hardly
escape learning these things. The discussion is kept at a level free of
representation details. Chapter 7 provides a rather detailed and complete
case study of one of the many ways in which the basic principles can be
applied in practice.

Two topics are missing: compilation and garbage collection. To
compile Prolog programs is to apply the general principles in such a way that a
program is executed particularly efficiently. This is done partly by taking
advantage of the underlying machine (e.g. by using machine code instead
of a more compact representation of programs, trading speed for memory)
and partly by performing special case analysis to detect operations which
can be simplified (e.g. unification with a variable which is known to be
uninstantiated). We decided that compilation is beyond the scope of this
book (which already discusses implementation issues more thoroughly
than the usual introduction to a programming language). The problem and
techniques of garbage collection are well known, and are best studied
independently of a particular programming language (though you will find
that in Prolog one has to do with one of the harder variants of the
problem).


6.2. REPRESENTATION OF TERMS


If we disregard the possibility of forming cyclic structures (see
Section 1.2.3), we can see that all terms are directed acyclic graphs (DAGs).
They are not necessarily trees, because different branches can converge
to a common component: in linear notation we express this phenomenon
by repetition, as in t(p(X), q(p(X), Y)).

In a Prolog program, several identical occurrences of a term within a 
single clause denote the same object. Properly speaking, this is not an 
object but a descriptor, or template. At execution time, it corresponds to 
different objects in different instances of the clause. In this and the next 
chapter we shall reserve the unadorned word “term” for term instances. 
Terms written in a program will be referred to as term descriptions. A 
description can have several occurrences; similarly, an instance can have 
several parents in a DAG. 

There are many possible representations of a DAG. For our present 
purposes they are all equivalent, provided that it is possible to distinguish 
nodes corresponding to Prolog variables. On a more abstract level, how¬ 
ever, two very different methods arc used to implement term instances. 
Accordingly, all existing implementations of Prolog can be classified as 
either Structure Sharing or Non-Structure Sharing (NSS). 

In principle, to form a new term instance in a Non-Structure Sharing 
system, one must create a new DAG. We are talking about creating new 
instances that correspond to term descriptions (present in the program 
text, or in clauses asserted after having been constructed by a program); 
creation of new terms as a result of unification is different. Variables are 
bound by being associated with pointers directed at their instantiations.
These pointers are invisible, i.e. automatically dereferenced, whenever
the DAG is traversed. Figure 6.1 illustrates—in a representation-independent
manner—two terms, before and after unification.
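
The binding mechanism just described can be sketched in a few lines of Python. This is only an illustration, not code taken from any Prolog system: the classes Var and Struct and the functions deref, unify and show are names invented here, and the sketch omits the occurs check and any undoing of bindings after a failed unification.

```python
class Var:
    """A variable node; `ref` stays None while the variable is unbound."""
    def __init__(self, name):
        self.name = name
        self.ref = None


class Struct:
    """A compound term, or a constant when there are no arguments."""
    def __init__(self, functor, *args):
        self.functor = functor
        self.args = list(args)


def deref(term):
    """Follow binding pointers until an unbound variable or a structure is reached."""
    while isinstance(term, Var) and term.ref is not None:
        term = term.ref
    return term


def unify(a, b):
    """Unify two terms by binding variables in place (no occurs check, no undo)."""
    a, b = deref(a), deref(b)
    if a is b:
        return True
    if isinstance(a, Var):
        a.ref = b
        return True
    if isinstance(b, Var):
        b.ref = a
        return True
    if a.functor != b.functor or len(a.args) != len(b.args):
        return False
    return all(unify(x, y) for x, y in zip(a.args, b.args))


def show(term):
    """Linear notation of a term; bindings are dereferenced on the way."""
    term = deref(term)
    if isinstance(term, Var):
        return term.name
    if not term.args:
        return term.functor
    return term.functor + "(" + ", ".join(show(arg) for arg in term.args) + ")"


A, B, X, Y = Var("A"), Var("B"), Var("X"), Var("Y")
left = Struct("t", A, Struct("q", A, Y))     # t(A, q(A, Y))
right = Struct("t", Struct("p", X), B)       # t(p(X), B)
unify(left, right)
print(show(left))
print(show(right))
```

After unification both terms print identically, although no structure was copied: A merely points at the existing node for p(X), and B at the existing node for q(A, Y).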

A Structure Sharing system takes advantage of the fact that different 
instances of the same term differ only in their variable bindings. Whereas
two instances of

t( p( X ), q( p( X ), Y ))
can be 

t( p( c ), q( p( c ), d )) and 

t( p( r( a ) ), q( p( r( a )), r( a ) ) ) 

respectively, their general structure remains the same. The main functor 
must be a t of two arguments; t’s first argument must also be the first
argument of the two-argument q which is t’s second argument; and so on.
Consequently, all instances of the term may share this structural information,
if only care is taken to let them have different variables. This is
easily achieved by associating each instance with a different variable
frame: a chunk of storage holding variable instances. The internal representation
of a term description—we shall call it prototype—is a DAG in
which each variable node is represented by information about the offset of
the variable’s location in a variable frame. All terms—including variable
bindings—are now represented not by single pointers, but by two-pointer
term handles 1

< prototype, variable frame >

1 Another terminology, introduced by Warren (1977a), is to call prototypes skeletons,
and handles molecules. We do not like the mixed metaphor.

FIG. 6.1 The Non-Structure Sharing representation of terms: (a) t(A, q(A, Y))
and t(p(X), B) before unification, (b) t(A, q(A, Y)) and t(p(X), B) instantiated to
t(p(X), q(p(X), Y)) after unification.

FIG. 6.2 The Structure Sharing representation of terms: (a) Two instances of
t(p(X), q(p(X), Y)), both sharing the same prototype, (b) term1 instantiated to
t(p(c), q(p(c), d)) and term2 instantiated to t(p(r(a)), q(p(r(a)), r(a))).

Figure 6.2 illustrates the principle of Structure Sharing. Figure 6.3
corresponds to Fig. 6.1. If we find general DAGs less convenient than
trees, Structure Sharing makes it easy to employ trees by providing implicit
links to variables from all occurrence sites. This is shown in Fig. 6.4.
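
A minimal Python sketch of term handles may make the idea concrete. The encoding is an assumption made for this illustration only: a prototype node is either ("var", offset) or (functor, list of argument nodes), a frame is a list whose cells hold None or a handle, and the names PROTOTYPE, constant and resolve are hypothetical.

```python
# A prototype node is ("var", offset) for a variable, or (functor, [argument nodes]).
# A term handle is a pair (prototype, frame); a frame cell holds None or a handle.
# (Sketch limitation: a functor actually named "var" would clash with the tag.)

P_X = ("p", [("var", 0)])                          # p(X): one node with two parents
PROTOTYPE = ("t", [P_X, ("q", [P_X, ("var", 1)])])  # t(p(X), q(p(X), Y))


def constant(name):
    """Handle for a constant: a variable-free prototype with an empty frame."""
    return ((name, []), [])


def resolve(handle):
    """Write out, in linear notation, the term denoted by a handle."""
    (tag, payload), frame = handle
    if tag == "var":
        binding = frame[payload]
        return "_" + str(payload) if binding is None else resolve(binding)
    if not payload:
        return tag
    return tag + "(" + ", ".join(resolve((arg, frame)) for arg in payload) + ")"


r_a = (("r", [("a", [])]), [])                     # handle for r(a)
term1 = (PROTOTYPE, [constant("c"), constant("d")])
term2 = (PROTOTYPE, [r_a, r_a])
fresh = (PROTOTYPE, [None, None])                  # a new, uninstantiated instance

print(resolve(term1))
print(resolve(term2))
print(resolve(fresh))
```

Creating a new instance costs only a fresh frame; the prototype object is the same for all three handles.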

FIG. 6.3 The Structure Sharing representation of terms: (a) t(A, q(A, Y)) and
t(p(X), B) before unification, (b) t(A, q(A, Y)) and t(p(X), B) instantiated to
t(p(X), q(p(X), Y)) after unification.

FIG. 6.4 Structure Sharing: the DAG t(p(X), q(p(X), Y)) represented by a tree.

Inside a clause, different occurrences of the same variable description
can appear within different term descriptions. There is the problem of
ensuring that the same variable becomes a part of all the corresponding
terms associated with a clause instance. With Structure Sharing, this is
done by allocating a single frame for all the variables appearing in a
clause, at the moment of its activation. All occurrences of a variable
within this clause’s prototypes are encoded as the same offset in the
common variable frame (this is just an application of the technique demonstrated
in Fig. 6.4).

In practice, most NSS implementations use a very similar approach 
to solve the problem (despite its name, it is a hybrid method). Term 
instances are also encoded as prototypes, with variables represented by 
offsets into a clause’s variable frame. One difference is that variable 
frame locations hold only single pointers rather than term handles. The 
other—more important—difference is that terms formed in this way are 
only “virtual” instances. This is to say that they may be used only as data 
selectors, directing unification to instantiate variables in the variable 
frame. Whenever one of these terms is to become a variable’s instantiation,
a “real” instance (a new DAG) must be built. If this new instance
contains a variable, its variable node becomes a copy of the appropriate 
location in the variable frame, while the location is made to hold a pointer 
to the node. This ensures that all future references to the variable will end 
up in the node. 
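
The construction of a “real” instance from a prototype might be sketched as follows. The representation is assumed for the purpose of illustration (prototype nodes are ("var", offset) or (functor, list of arguments), and Var and build are hypothetical names); when a frame location is already bound, the sketch simply shares what the location points at, which amounts to the same thing once pointers are dereferenced.

```python
class Var:
    """A variable node inside a "real" term instance."""
    def __init__(self):
        self.ref = None              # None while uninstantiated


def build(prototype, frame):
    """Copy a prototype into a "real" instance, using the clause's variable frame."""
    tag, payload = prototype
    if tag == "var":
        if frame[payload] is None:
            # First copy of an unbound variable: create its node and make the
            # frame location point at it, so later references end up there.
            frame[payload] = Var()
        return frame[payload]
    return (tag, [build(arg, frame) for arg in payload])


P_A = ("p", [("var", 0)])            # the term description p(A)
frame = [None]                       # one location, for A
first = build(P_A, frame)
second = build(P_A, frame)
print(first is second, first[1][0] is second[1][0])
```

The two copies of p(A) are distinct structures, yet they contain the very same variable node, so binding A through one copy is visible through the other.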

The process is shown in Fig. 6.5. Note that here, too, prototypes can
be trees, and more general DAGs are implemented by variable bindings.
In general, the process of term creation can give rise to several copies of a
single term instance (p(A) in the example), but these are indistinguishable.

FIG. 6.5 “Virtual” and “real” instances in Non-Structure Sharing: (a) proc(t(p(A),
p(A), q(p(A), B))) is called with proc(t(p(X), Y, q(Z, r(Y))))—both terms are “virtual”
before unification. (b) The first occurrence of p(A) acts as a selector—A is bound to X.
(c) The second occurrence of p(A) acts as a constructor—Y is bound to its copy (a “real”
instance). (d) q(p(A), B) and q(Z, r(Y)) both act as selectors, but p(A) and r(Y) are
constructors—both terms are now t(p(X), p(X), q(p(X), r(p(X)))), represented by a mixture
of “real” and “virtual” instances.

The process of copying an uninstantiated variable might seem a little 
roundabout: why don’t all copies simply contain pointers to the variable 
frame location? Indeed, why are there any copies at all: is not Structure 
Sharing always better? 

Recall from Chapter 1 (see Fig. 1.6) that a term’s lifetime may have to
exceed that of the encompassing clause instance. Yet it is obvious that we
would like to regard a variable frame—which is created when a clause is
activated—as a part of the clause’s activation record. If we are careful to
represent variable-to-variable bindings so that younger variables point at
older ones rather than the other way round, and if term copies contain no
pointers into variable frames, then there is no risk of leaving dangling
pointers as activation records are deallocated upon procedure completion
(according to the normal stack regime) 2.

The situation is quite similar to that encountered in Pascal, say: an 
object which is to live longer than the procedure which has created it is 
allocated in the heap, i.e. a memory area distinct from the activation 
stack. The NSS heap is called a copy stack. It is a true stack, because term 
copies can obviously be discarded when the program backtracks past 
their point of creation. They can become inaccessible much earlier, too, 
and a garbage collector could be very useful, but it is not essential to 
Prolog as it is to Lisp. One must remember, however, that without garbage
collection the copy stack’s size is roughly proportional to the
amount of time spent in forward execution (without backtracking), and
that one may need to alleviate that by introducing artificial failures in a
few well-chosen places (see Section 4.3.2).

2 But this is not always possible (see the next section).

In the simplest form of Structure Sharing, a variable frame is an 
integral part of the representation of a number of term instances, and 
cannot—in principle—be deallocated so long as any of these terms is 
accessible. It must be allocated on a variable stack, which closely resembles
the copy stack of NSS (except that a garbage collector, if required, is
harder to implement). The activation stack is smaller, as it only holds 
control information. 

The most important advantage of NSS is that retention past the moment
of procedure termination concerns only those terms which become
variable instantiations. With simple Structure Sharing, on the other hand,
all terms are retained. As it turns out, terms are often used as selectors
rather than constructors, and clauses frequently propel a computation
along without creating many long-lived objects. The copy stack is therefore
usually smaller than the variable stack, and the effects of memory
requirements being a function of time are much less pronounced with 
NSS. 

Starting with DECProlog-10, many Structure Sharing implementations
take advantage of the difference between terms which must live
longer than their clauses and those which need not. As a clause is read in, 
it is analysed to detect variables which cannot, under any circumstances, 
be used to form instantiations of variables outside the clause. These are 
classified as local variables, whereas the others are called global. The 
variable names are all local to the clause, of course—the terminology is to 
convey that global variables are long-lived, while local variables may be 
allocated (and deallocated) with the clause’s activation frame. The activation
stack is accordingly referred to as the local stack, and the global stack
holds global variables. 

A simple, though not necessarily the most subtle, classification criterion
is whether a variable appears inside a term (i.e. is not only a procedure’s
parameter). For example, in

a( X, f( Y )) :- b( X, g( Z )).

we find that X is local. The rule about directing variable-to-variable 
references towards the bottom of the stack suffices to ensure that its deallocation
will not leave dangling pointers. The variable Y is obviously
global, as the clause can “export” it after having been activated by 

a( Something, Variable ). 

The status of Z is uncertain. It can be bound to a variable in b, but we are 
really interested only in those outside variables which outlive a. If the
body of a were

b( g( Z ) )

then Z could conceivably be classified as local (according to our experience,
though, allowing such cases could complicate the implementation).
But as the clause stands, we need to analyse b (assuming it will not 
subsequently be modified!) to check whether g(Z) can be made an instan¬ 
tiation of a variable to which X is bound. For example, with b defined as 

b( V, V ) 

the call 

a( P, Q ) 

would instantiate P ← g(Z) and Q ← f(Y)—both Y and Z would be “exported.”
Variables that do not appear in terms can only be used to carry
information around the clause; it is safest to assume that all others will be 
used to form structures. 
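
The simple criterion is easy to state as a program. In the Python sketch below, the term encoding is an assumption made for illustration: a variable is a string beginning with a capital letter, any other string is a constant, and a call or compound term is a tuple (functor, arg1, ..., argN). The function names are hypothetical.

```python
def is_var(term):
    """Variables are strings starting with a capital letter; other strings are constants."""
    return isinstance(term, str) and term[:1].isupper()


def collect(term, nested, all_vars, nested_vars):
    """Record every variable, and separately those occurring inside a compound term."""
    if is_var(term):
        all_vars.add(term)
        if nested:
            nested_vars.add(term)
    elif isinstance(term, tuple):                # (functor, arg1, ..., argN)
        for arg in term[1:]:
            collect(arg, True, all_vars, nested_vars)


def classify(head, body):
    """Return (local, global) variable lists; head and body goals must be tuples."""
    all_vars, global_vars = set(), set()
    for goal in [head] + list(body):
        for parameter in goal[1:]:               # top-level parameters of the call
            collect(parameter, False, all_vars, global_vars)
    return sorted(all_vars - global_vars), sorted(global_vars)


# a( X, f( Y )) :- b( X, g( Z )).
head = ("a", "X", ("f", "Y"))
body = [("b", "X", ("g", "Z"))]
print(classify(head, body))
```

For the clause discussed above this reports X as local and both Y and Z as global, which is the safe assumption: Z is treated as global without any analysis of b.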

This assumption does not yet allow Structure Sharing to be really 
competitive with NSS. To achieve this, we must declare our intentions by
providing so-called mode declarations. In Prolog-10 one writes 

mode member(?, +). 

to inform Prolog that the second parameter of member will never be a 
variable, though its first parameter might be one. This means that the 
procedure 

member( E, [ E | L ]). 

member( E, [ X | L ]) :- member( E, L ).

will not be invoked as a generator of lists, so the compound terms will only be
used as selectors and all the variables—even those global by the general
criterion—can be classified as local 3.

Providing mode declarations may seem a nuisance, but they are good 
documentation (and are not compulsory). The declarations are static and 
must necessarily be less informative than the dynamic special-case analysis
of NSS. In common cases, however, the difference is not detectable
and this form of Structure Sharing is, in fact, as good as NSS with regard
to memory utilisation. This does not mean that the two behave identically.
Programs can be written which make any one method almost arbitrarily
worse than the other (how would you go about devising such a
program?). 


3 A compiler can also use this information to generate faster code.

Structure Sharing tends to be faster, but it is more complicated. There
is the problem of analysing clauses, utilising mode declarations and manipulating
term handles instead of single pointers. Moreover, system routines
such as clause are harder to write because a clause can contain
references to local variables and its instance is not therefore a correct
term. If you want to write a simple memory-efficient interpreter, use
Non-Structure Sharing.


6.3. CONTROL


One of the keys to the success of a Prolog implementation is the 
efficiency of backtracking. Whenever a fail point is established (see Section
1.3.2), the computation’s state must be saved, so that it can be
restored upon failure. Both the saving and the restoration of a state are 
frequent events, which must take place as rapidly as possible. 

The state of a computation can be reduced to the contents of the
control stack and the heap 4. Obviously, Prolog’s special requirements
rule out checkpointing (i.e. dumping memory contents) as a means of
saving the state. Logging (i.e. recording changes made to the state) is a
more hopeful technique, as differences between successive states of interest
are usually minute in comparison to the amount of information contained
in a state. The technique is particularly suitable—and universally
used—for dealing with the evolution of variable instantiations. Only uninstantiated
variables can be modified, so the old value need not be remembered
and it is enough to record a modified variable’s address.

While logging is also a viable method of handling activation record 
traffic on the control stack, it would not be able to take advantage of the 
disciplined manner in which procedure instances are created and destroyed.
A better method, well known since the appearance of (Bobrow
and Wegbreit 1973), can roughly be described as using the log itself to
define a new state.

As a fail point is established, a fail point record is pushed onto a
special stack. (We are interested in a conceptual description. In practice,
this stack is often implemented by a chain of pointers threaded through
activation records.) The fail point record stores information about current
sizes of memory areas and a pointer to the list of untried clauses likely to
match the current call. In other words, it contains information essential to
Prolog’s ability to recommence computations from this fail point. To
make this information sufficient, stack and heap areas below the levels
indicated by a fail point record are treated as frozen, i.e. under special
protection.

4 The generic terms are meant to emphasize that this discussion is valid for Structure
Sharing and NSS alike.

Binding a frozen variable is allowed, but must be logged by pushing its 
address onto a fourth stack, called the trail (its size is also remembered in 
a fail point record). The control stack, however, is frozen quite literally. 
Whenever a terminating procedure would cause control to be returned to 
an activation record (AR) within the frozen area, a copy of the AR is 
created just above the protected part of the stack. The copy defines the 
current procedure’s environment: an ancestor link provides access to the 
frozen AR of the procedure’s caller. To avoid copying that part of an AR 
which contains variables 5, the variables of a clause are associated with the AR of
its caller rather than with its own AR. An AR’s copy will be used to perform a 
new call: the original describes the previous call, so its variables are irrelevant. 
All this is illustrated in Fig. 6.6. 

With these precautions, backtracking consists in undoing bindings 
made after creating the most recent fail point record FR (a simple matter 
of resetting locations referenced in the top-most fragment of the trail), 
popping all stacks to the levels indicated by FR, grabbing the untried 
clause list and popping FR itself. This is rapid enough; the unescapable 
penalty is that of maintaining (several copies of) frozen substacks which 
would normally disappear with the shortening of call chains. One of the 
reasons why judicious use of the cut is so important (see Section 4.3.1) is 
that it allows Prolog to reclaim stack storage. To invoke the cut is to pop a 
number of fail point records, thereby unfreezing areas of memory. 
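
The interplay of the trail and the fail point records can be sketched in Python. This is a deliberately reduced model, not the algorithm of any particular system: a single list of variable cells stands in for all the stacks whose sizes a fail point record remembers, the copying of frozen activation records is left out, and the class and method names are invented for the example.

```python
class Machine:
    def __init__(self):
        self.cells = []          # variable cells; None means uninstantiated
        self.trail = []          # addresses of frozen cells bound since the last fail point
        self.fail_points = []    # the stack of fail point records

    def new_var(self):
        self.cells.append(None)
        return len(self.cells) - 1

    def push_fail_point(self, untried):
        """Remember the current sizes and the list of untried clauses."""
        self.fail_points.append(
            {"cells": len(self.cells), "trail": len(self.trail), "untried": untried})

    def bind(self, address, value):
        assert self.cells[address] is None       # only unbound cells are modified
        self.cells[address] = value
        # Only frozen cells are logged: younger ones vanish when the stack is popped.
        if self.fail_points and address < self.fail_points[-1]["cells"]:
            self.trail.append(address)

    def backtrack(self):
        """Undo logged bindings, pop the stacks, and return the untried clauses."""
        record = self.fail_points.pop()
        while len(self.trail) > record["trail"]:
            self.cells[self.trail.pop()] = None
        del self.cells[record["cells"]:]
        return record["untried"]

    def cut(self, depth):
        """Pop fail point records down to `depth`, thereby unfreezing memory."""
        del self.fail_points[depth:]


m = Machine()
x = m.new_var()
m.push_fail_point(["clause 2"])
y = m.new_var()
m.bind(x, "a")                   # x is frozen, so its address is trailed
m.bind(y, "b")                   # y is younger than the fail point: not trailed
trail_before = list(m.trail)
untried = m.backtrack()
print(trail_before, m.cells, untried)
```

Note that the old value of a trailed cell need not be stored: resetting it always means making it uninstantiated again.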


6.4. TAIL RECURSION OPTIMISATION 


Many programming tasks are inherently iterative. For example, to 
test whether an item is present in a list, we must look at successive 
elements until either the list is exhausted or the item is found. But in 
Prolog we can only define member as a recursive procedure. Recursion is 
more expensive than iteration in that it requires not only time but also 
stack storage which grows linearly with the number of turns. Storage is 
often a scarce resource, and it would be very unsatisfactory if each decision
to traverse a list had to be accompanied by speculations about the
potential length of the list. In Prolog, using recursion instead of iteration is 
all the more serious because the stack may subsequently have to be frozen. 


5 Local variables of Structure Sharing, “virtual” variable instances of NSS.

FIG. 6.6 Control stack management: (a) The example program. (b) Clause (iv) is
invoked. (Solid lines in control stack are the ancestor links, dotted lines are variable
bindings. Active calls are underscored, remaining calls in each clause are also shown. The model
is NSS.) (c) One step later, d is ready to return. (d) After returning from d and b. (Frame 3 is
a copy of 1, executing the next call. The variable Z was destroyed when the stack was
popped—it was just above the freezing level.) (e) After failure, before invocation of clause
(iii). (f) Clauses (iii) and (i) terminated, directive in control.


Tail recursion optimisation (TRO) is the technique of replacing some 
forms of recursion with iteration. Despite its name, it is also useful in 
situations where there is no direct recursion, or even no recursion at all— 
just a long chain of procedure calls. 

FIG. 6.7 Tail recursion optimisation.

The general idea is illustrated in Fig. 6.7. Assume that q is the last call
in p and that p is deterministic, i.e. there are no fail points between the
invocations of p and q (they were not established or were removed by a
cut). This means that neither activation record is frozen, and that the two
are not separated by locations containing useful information. Now, if the activated
clause of q is known to be the last clause matching its call, then
some of the information in the two activation records clearly becomes 
redundant. The younger frame’s control information is no longer needed, 
because the only thing p can do after q’s termination is return immediately 
to its caller: if q returned to the caller of p, the effect would be the same. 
Similarly, the older frame’s variables (local or “virtual”) will not be 
needed by p. TRO is the technical term for replacing the two—either
during or after q’s invocation—by one activation record, with q’s variables
and with control information needed to exit from p. If q is the same as
p (or contains a tail recursive call on p, or the like), many calls may be 
executed without increasing the size of the control stack. (The heap may 
grow, though, if the computation constructs some long-lived objects.) 
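
The effect on the control stack can be shown with a toy simulation in Python. It is a sketch under strong assumptions: every call in the chain is the last call of its caller and is deterministic, a frame holds nothing but the name of the procedure and its return information, and run_chain is a hypothetical name.

```python
def run_chain(length, tro):
    """Execute a chain of `length` deterministic last calls.

    Returns the peak depth of the control stack and the place to which
    the last procedure in the chain will return.
    """
    stack = [{"proc": "p0", "return_to": "caller"}]
    peak = 1
    for i in range(1, length + 1):
        callee = {"proc": "p%d" % i}
        if tro:
            # The callee inherits the caller's exit information and takes its place.
            callee["return_to"] = stack[-1]["return_to"]
            stack[-1] = callee
        else:
            # Ordinary call: a new frame which returns to its immediate caller.
            callee["return_to"] = stack[-1]["proc"]
            stack.append(callee)
        peak = max(peak, len(stack))
    return peak, stack[-1]["return_to"]


print(run_chain(1000, tro=False))
print(run_chain(1000, tro=True))
```

Without the optimisation the stack grows by one frame per call; with it, a single frame is reused and the final procedure returns directly to the original caller.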

Several methods of implementing TRO are described in the literature, 
and we shall not discuss them here. An important feature of some of them is 
that they allow delayed TRO, i.e. merging of our two activation records after q 
performs a cut, even though its initial invocation is not deterministic. In 
Section 7.3.4 you will find one such method, which we favour for its
simplicity.


6.5. BIBLIOGRAPHIC NOTES

The idea of structure sharing comes from Boyer and Moore (1972). It 
was used in the original Marseilles interpreter (Battani and Meloni 1973,
Roussel 1975), which, actually, was preceded by an earlier, experimental 
version (Colmerauer et al. 1972). That interpreter did not have anything 
like fail point records. Though variables were allocated on a separate 
stack, control frames were also—as a rule—popped only on backtracking.
Classification of variables into local and global was introduced with
the DECProlog compiler. Warren (1977a) is the original reference, see 
also Warren et al. (1977) and Warren (1980b). A preliminary report on the 
first NSS implementation is Bruynooghe (1976). 

The idea of tail recursion optimisation is well known. Bruynooghe 
was the first to use TRO in Prolog, while Warren used a different method 
as an afterthought; see Warren (1980a). 

A good detailed explanation of the implementation principles is Bruynooghe
(1982b). It stresses both the similarity of structure sharing to
conventional handling of procedure instances and the similarity of Prolog’s
control structures to a proof tree. Van Emden (1982) contains a
disciplined derivation of the control algorithm, starting from search-tree 
traversal. 

Most implementations merge fail point records and control frames 
into a single type of record. To our knowledge, they were first separated 
in Donz (1979), an early approach to global optimisation, where they were 
talked of as the and-nodes and or-nodes of a search tree. We like the 
separation because it brings to light the fact that backtracking is implemented
almost exactly as proposed—in a more general setting—in Bobrow
and Wegbreit (1973), the classic paper on implementation of unconventional
control structures.

A comparison of NSS and structure sharing can be found in Mellish 
(1982), with some comments in Bruynooghe (1982b). 

Mellish (1981) is an early approach to automatic production of mode 
declarations by means of global flow analysis. Other papers concerned 
with global analysis, though not for the sake of efficiency, are Bruynooghe
(1982a) and Mycroft and O’Keefe (1983).

At the time of this writing we know of two new compilers being 
developed. The references are Bowen et al. (1983) and Ballieu (1983). 

See also Section 2.5 for references on Prolog implementations with 
coroutining and parallelism. 

As a point of interest, we shall mention two papers describing implementations
of Prolog done by embedding it in another programming language:
Lisp (Komorowski 1982) or POP-11 (Mellish and Hardy 1983).







7 TOY: AN EXERCISE IN IMPLEMENTATION


7.1. INTRODUCTION 

This chapter is a case study of Toy—a simple but fairly complete 
implementation of Prolog. Only the most important (or least obvious) 
information is presented here, and it should be read together with the 
source texts available on the diskette enclosed with this book (some of these 
are listed in the appendices). 

While designing Toy, we attempted to strike a compromise between 
several conflicting goals. We wanted to write: 

—A clean, readable interpreter which you could find useful for “getting a 
feel” of what is involved in implementing a “life-size” Prolog system; 
—A usable interpreter, which we could use to test all the programming 
examples in this book (our extant implementations were quite incompatible
with Prolog-10) and which you might use to experiment with
Prolog if you have a lot of time but no access to a machine running one 
of the commercially available Prolog systems; 

—A large fragment of the implementation in Prolog itself, to provide a 
sizable example of using the language for solving well-known but not 
completely trivial programming tasks at a relatively low level; 

—An interpreter which, though useful, would have little commercial 
value. 

We decided to use Pascal, because it is easy to read, well known 
and generally available. The program is not written to be very efficient: 
concern for readability and conciseness almost always prevailed. It is 
not particularly short and elegant either, as we wanted it to support a
fairly complete version of Prolog modelled after the Prolog-10 dialect.

There are two principal reasons why we call it Toy: 

—The user interface is written in Prolog, and this makes it rather slow; 

—There is no garbage collector, and moreover, partitioning storage 
into several disjoint fixed-length areas makes it easier to encounter a 
memory overflow condition. 

If you decide to use Toy, you will quickly find that the time taken to 
read and write terms requires some patience. We had to rewrite read, 
write and op in Pascal for our purposes, and it is but a moderately difficult 
task. A rather straightforward implementation resulted in another 1000 
lines of code, but a lot of it is dedicated to handling mixed functors (see 
Section 7.4.3). 

We used Toy on two minicomputers: a PDP 11/40 look-alike running
RSX-11M, and a Polish computer called Mera 400. The PDP has an address
space of 64KB; we used it to bring the system up, but it was a tight
squeeze. You might do better with a P-code system rather than with a 
native-code compiler of Pascal, such as the one we had to use. The Mera 
had a 128KB address space and a fairly good native-code compiler (but 
with no attempt at global optimisation): we could easily load and execute 
both the whole Prolog interface and programs such as WARPLAN or 
Toy-Sequel (see Chapter 8). We tested all our programs and had quite a 
bit of memory to spare, running in a 104KB space. 

The original implementation was subsequently ported (almost painlessly!) 
into Berkeley Pascal (on the VAX/780 running 4.2 BSD UNIX) and into 
TURBO Pascal (on the IBM PC running MS-DOS 2.10). You can find the 
TURBO version on the diskette enclosed with this book. Files READ.ME, 
CONTENTS, INSTALL and TURBO.PAT contain general information 
about the diskette and the implementation. The latter file summarizes changes 
introduced into the original Mera Pascal which was listed in the hardcover 
edition of this book. 

Feel free to run Toy and play with it, but remember it is copyrighted. No 
version of this implementation may be used or distributed for gain, all listings 
must contain our copyright notice, and the heading produced by status (see
Section 5.7.5) must contain the texts “Toy-Prolog” and “IIUW Warszawa”.
Other than that, you are welcome to modify it, give it to friends, etc. If you 
have any comment to make, we shall be happy to hear from you. 


7.2. GENERAL INFORMATION 

Toy is a Non-Structure Sharing interpreter (see Chapter 6). The program
written in Pascal supports a limited syntax, which we shall call Toy-Prolog,
and only a subset of the usual system (built-in) procedures. The
full user interface and library are implemented in Prolog (see Section 7.4)—
this approach was taken in the original Marseilles implementation, and in
a number of implementations since. A short program called the “bootstrapper”
(see Section 7.4.1), written in Toy-Prolog, is used to translate
into Toy-Prolog other parts of the user interface, which are written in a
slightly restricted form of the usual syntax. Next, various interface programs
can be loaded during initialization (see Section 7.3.6).

A Prolog program called the “monitor” supports an interactive programming
regime (see Section 7.4.2). Full Prolog-10 syntax can be used
(see Sections 7.4.3-7.4.5). A program called the “translator” can be used
to convert Prolog-10 programs into Toy-Prolog (see Section 7.4.6). The
translator shares most of the monitor’s routines. It can be used for large
(interactively debugged) programs which are to be loaded quickly, without
repeated syntactic analysis by the rather slow parser in the monitor.
See Appendix A.4 for a few examples of such programs. 

We shall finish this section with an example of Toy-Prolog syntax. 
There is no point in providing a precise description of this language, as it 
is very simple and the recursive-descent parser (a fragment called the 
READER, see file READER.PAS on the diskette) is so straightforward that 
it can easily be used to resolve all doubts. Our example is 

p( [ a, [ b, c ], d | X ], Y ) :- q( Y, X ), r( s( Y ), _ ).
:- p( Z, ( t, u, v )).

To make it directly acceptable to the READER, we write

p( a.( b.c.[] ).d.:0, :1 ) : q( :1, :0 ) . r( s( :1 ), _ ) . []

: p( :0, ','( t, ','( u, v ))) . []#

See Appendix A.2 for further examples. The syntax is not nice, but is
very close to the internal representation of clauses. 
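
One visible feature of Toy-Prolog is that variable names are replaced by numbers, assigned within each clause in order of first occurrence. The following Python fragment sketches that single step only; it is not the bootstrapper or the translator, the function name is hypothetical, and it ignores quoted atoms, strings and variables beginning with an underscore.

```python
import re


def number_variables(clause):
    """Replace each named variable by :N, N being the order of its first occurrence."""
    numbers = {}

    def rename(match):
        name = match.group(0)
        if name not in numbers:
            numbers[name] = len(numbers)
        return ":%d" % numbers[name]

    # A variable is taken to be an identifier starting with a capital letter;
    # the anonymous variable "_" is left alone.
    return re.sub(r"\b[A-Z][A-Za-z0-9_]*", rename, clause)


print(number_variables("p( [ a, [ b, c ], d | X ], Y ) :- q( Y, X ), r( s( Y ), _ )."))
print(number_variables("p( Z, ( t, u, v ) )."))
```

Numbering starts afresh for every clause, so Z in the second clause becomes :0, just as X does in the first.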


7.3. THE TOY-PROLOG INTERPRETER 

7.3.1. The Principal Storage Areas 

Toy uses several disjoint areas of memory for its data structures (see
Fig. 7.1). They are listed below.

FIG. 7.1 The main data areas: CT (strings), AT, FT (frames), TT (trail), BT, and MT
(non-ground prototypes, ground prototypes, copy stack, variable stack).

—CT (character table), used to store strings: print names of Prolog functors
and predicate symbols;

—AT (atom table), used to store atoms. In this chapter “atom” does not
denote a functor with no arguments. It is the generic name of a record
containing useful information about a symbol (a functor or predicate
symbol in our case);

—MT (main table), used to store term instances and prototypes. There 
are two subareas here: 

—Prototype storage, which is further divided into disjoint storage 
areas for ground (variable-free) prototypes and for those that contain
variables. The classification is important because a ground
prototype can be used to represent all its instances, and need not
be copied onto the copy stack;

—Stack storage, which is further divided into disjoint areas for the
copy stack and the variable stack (the variable stack holds variables
from activation records: Pascal’s type mechanism made it
more convenient to keep control information from activation records
in a separate table FT);

—FT (frame table), used as the activation record stack (but variables are 
stacked in MT); 

—BT (backtrack table), used as the fail-point stack (here called backtrack-point
stack—we just needed a different letter to label the table);

—TT (trail table), used as the trail stack; 

—Pascal’s heap, used to store procedure descriptors; 

—Pascal’s stack, used for recursion in unification and term-copying operations.

In what follows, we shall use the word pointer, or address, to denote
both Pascal pointers and indices into the tables.


7.3.2. The Dictionary: Atoms and Procedure Descriptions

The character and atom tables form the dictionary: a data structure 
used primarily as an aid in translating between the external and internal 
forms of Prolog terms and clauses. It also supports access to procedures, 
making it easier to implement variable calls and clause manipulation. 

An atom is a record containing information about a functor and/or a 
predicate symbol. The difference between a term and a procedure depends
only on context and is not always recognized. Predicate symbols
are denoted by functors when clauses are treated as terms (e.g. in assert); 
conversely, a functor may be used to invoke a procedure (as in call). 

The attributes of an atom are 


—Its print name (a pointer to a string in CT); 

—Its arity; 

—The procedure of this name and arity (a pointer to a procedure descriptor,
or nil).

Atoms are accessed through direct pointers or through a hashing procedure.
Direct pointers are present in the representation of terms (including
clauses; see the next section). The pointers are used for 

—Printing a functor, 

—Determining arity, 

—Finding a procedure. 


In particular, the representation of a call contains a pointer to an atom as 
the only handle on its procedure. Addition and deletion of clauses in the 
procedure does not therefore require modification of its calls. 

Hashing is used to locate appropriate atoms during conversion from 
external representation. Such conversion takes place when terms are read 
in or when they are created by functor and pname. For simplicity, linear 
rehash is used in the current version: you might wish to improve it. 

Print names are represented in CT by contiguous sequences of
characters terminated with EOS characters (zero bytes). As a name is created
by pname or the READER, its characters are pushed on top of the string
area in CT (procedure buildname). On termination of the string,
wrapname is invoked to locate an atom with the same print name. If such an
atom is found, the string is obliterated; otherwise a new atom is created
and returned. Since this atom’s arity is unknown, the arity field is set to
the special value of noarity (procedure findname).

Atoms are located by the READER in a two-phase process. First,
buildname and wrapname are used to find the first atom with this name;
then a single scan through the (virtual) hash chain finds an atom with the
correct arity, or detects its absence and creates it (procedure findatom).
Conversion between atoms of different arities, needed to implement
functor, requires invocation of the hash algorithm to locate the beginning
of the appropriate hash chain (procedure samename).
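
The two-phase lookup can be sketched as follows. This is an illustration in Python, not the Pascal code of the listing: the table size, the hash function, the value of NOARITY, the names AtomTable, find_name and find_atom, and the decision to let the first request with a known arity claim the noarity placeholder are all assumptions made for the example. Only the linear rehash, the noarity placeholder and the name-then-arity search come from the text.

```python
NOARITY = -1  # assumed encoding of the "arity not yet known" value


class Atom:
    def __init__(self, name, arity):
        self.name = name
        self.arity = arity
        self.proc = None  # procedure descriptor, or None


class AtomTable:
    def __init__(self, size=64):
        self.slots = [None] * size

    def _chain(self, name):
        """Yield the slot indices of the (virtual) hash chain for a name."""
        start = sum(map(ord, name)) % len(self.slots)
        for step in range(len(self.slots)):      # linear rehash
            yield (start + step) % len(self.slots)

    def find_name(self, name):
        """Phase 1: the first atom with this print name (created if absent)."""
        for i in self._chain(name):
            atom = self.slots[i]
            if atom is None:
                self.slots[i] = Atom(name, NOARITY)
                return self.slots[i]
            if atom.name == name:
                return atom
        raise MemoryError("atom table full")

    def find_atom(self, name, arity):
        """Phase 2: scan the chain for the right arity, creating the atom if needed."""
        self.find_name(name)
        for i in self._chain(name):
            atom = self.slots[i]
            if atom is None:
                self.slots[i] = Atom(name, arity)
                return self.slots[i]
            if atom.name == name:
                if atom.arity == NOARITY:
                    atom.arity = arity           # claim the placeholder (assumption)
                    return atom
                if atom.arity == arity:
                    return atom
        raise MemoryError("atom table full")
```

Asking twice for foo/2 returns the same atom, while foo/1 gets a separate atom further along the same chain.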

Procedure descriptors are allocated in the Pascal heap. Descriptors 
are formed of lists of records, each of them with: 

—a pointer to the next element in the list; 

—the number of variables in an activation record; 

—either the number of a system procedure, or pointers to the prototypes 
of a clause’s head and body. 

A system procedure descriptor is formed of a single such record. The 
descriptor of a Prolog procedure is a list of records, one for each clause. 
The head predicate’s atom always points at the first element of this list. 

A clause body is represented by the prototype of a Prolog list contain¬ 
ing its calls. Figure 7.2a illustrates the layout (recall that the binary dot is 
the Prolog list constructor). 


7.3.3. Prototypes and Term Instances 

The main table, MT, holds a variety of objects which are distin¬ 
guished partly by their addresses and partly by their contents. Addresses 
are used to distinguish between prototypes and term instances (fields 
denoting variables contain variable offsets in prototypes, and variable 
bindings in term instances). Prototypes of ground terms, which contain no 
variables, are also used as instances: this helps keep down the size of the 
copy stack. 

Instances of non-ground terms are kept in the stack area. It is divided 
into the copy stack and the variable stack. The variable stack holds acti¬ 
vation-record variables and is separated from the copy stack because it 
can shrink on procedure return and not only upon backtracking (see 
Chapter 6). 

Object contents are used to distinguish between integers, variables, 
and “normal” terms with functors. 

—Integers are two-word objects. The second word holds the integer and
the first holds a special marker, INT, which prevents the interpreter from
treating integers as pointers.

—Variables hold values less than or equal to VARLIM (both INT and pointers
to MT or AT objects have values greater than VARLIM). VARLIM is
kept only inside the dummy variable (_) prototype, whose address is
DUMVARX; this prototype is treated as ground. The value
FREEVAR (equal to VARLIM - 1) fills free variable instances. Values
below FREEVAR are negative: in prototypes their absolute values
denote offsets in variable frames, and in instances their absolute values
are pointers to variable bindings. (Actually, the situation is slightly
different: INT = 1, VARLIM = 0 and FREEVAR = -1. All negative
entries denote non-dummy variables. MT’s lower index is 2, but
variable frame offsets start from 0 and are therefore adjusted by the
constant OFFOFF = 2; -2 stands for offset 0, -3 for offset 1, etc.)

—“Normal” terms are contiguous sequences of words. The first word
holds a pointer to the main functor’s atom (its arity field defines the
length of the sequence). Other words represent arguments. For variable
arguments see above; other arguments are represented by pointers to
appropriate objects (see procedures getarity and getarg in the listing).

FIG. 7.2 (a) The abstract form:

    member( :0, :0._ ) : []
    member( :0, _.:1 ) : member( :0, :1 ) . []

(b) The data structures (variable offsets adjusted by offoff; []/0,
./2, ,/0, member/2 denote addresses).

FIG. 7.3 Internal representation of terms.

Figure 7.3 illustrates these conventions. Figure 7.2b shows the com¬ 
plete internal representation of a procedure. 
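
The content-based conventions can be restated as a short Python sketch. The constants are the ones given in the text; the function names are hypothetical, and treating every value above INT as a pointer is an assumption that follows from MT’s lower index being 2.

```python
INT = 1        # marker word that precedes an integer value
VARLIM = 0     # content of the dummy variable prototype
FREEVAR = -1   # content of a free variable instance
OFFOFF = 2     # adjustment applied to variable frame offsets


def offset_to_word(offset):
    """Encode a variable frame offset as a prototype word: 0 -> -2, 1 -> -3."""
    return -(offset + OFFOFF)


def word_to_offset(word):
    """Decode a prototype variable word back into a frame offset."""
    return -word - OFFOFF


def classify(word):
    """Classify one word of MT by its content alone."""
    if word == INT:
        return "integer marker"
    if word == VARLIM:
        return "dummy variable"
    if word == FREEVAR:
        return "free variable"
    if word < FREEVAR:
        return "variable"   # an offset in a prototype, a binding in an instance
    return "pointer"
```

Whether a negative word is an offset or a binding is decided by the address at which it is found (prototype area or stack area), not by the word itself.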

As explained in Chapter 6, term instances are pushed onto the copy
stack only when absolutely necessary (when they become variable
bindings) and are otherwise represented as in Structure Sharing. It is therefore
convenient to represent all instances by a pair of pointers. If the first
pointer addresses a prototype, the second (which we shall call the
prototype’s environment) is a pointer to an area in the variable stack. If the first
pointer addresses a term instance, the second is disregarded. Note that,
unlike in Structure Sharing implementations, the environment need
never change as term arguments are accessed; variable bindings are never
nonground prototypes and require no environment.

The normal mechanisms of object recognition and creation are
circumvented in two major cases (see procedure loadsyskernel).

—To avoid creation in the copy stack of too many integer objects
representing intermediate results, a range of the most frequently used
integers (-1..10 in this version) is maintained in the form of unique ground
prototypes.

—To avoid the overhead of locating character atoms, checking whether
functors represent characters, and duplicating character prototypes or
instances, ground prototypes of ASCII characters are kept in a contiguous
area of MT. Accessing a character prototype requires only the
addition of its ordinal number to this area’s address.
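
Both shortcuts amount to simple address arithmetic, as the following Python sketch suggests. The base addresses and the layout of the integer area are hypothetical; the text states only that a character prototype is found by adding the ordinal number to the area’s address, and that integers are two-word objects.

```python
CHAR_BASE = 100   # hypothetical address of the prototype of chr(0)
INT_BASE = 40     # hypothetical address of the prototype of -1
INT_WORDS = 2     # an integer object occupies two words (marker, value)


def char_prototype(ch):
    """Address of a character's ground prototype: area address plus ordinal."""
    return CHAR_BASE + ord(ch)


def small_int_prototype(n):
    """Address of a preallocated integer prototype, or None outside -1..10."""
    if -1 <= n <= 10:
        return INT_BASE + (n + 1) * INT_WORDS
    return None
```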

Certain other objects also have representations at addresses known to 
the interpreter. Apart from “popular” integers, characters and the 
dummy variable, there is also a prototype of the atom [] (see the begin¬ 
ning of the global variable declarations for a listing of all these addresses). 
Addresses of atoms requiring special treatment are kept in the table STD 
(see the definition of type stdatomid). There is also the prototype of a
dummy clause, whose body consists of a single call to error/1, located at
address errcallseq.

Procedures for handling term representations are quite straightfor¬ 
ward. Only prototype creation might not be immediately obvious. The 
method is quite similar to that used for creating entries in the dictionary. 
A prototype is allocated by invoking initprot with information about the 
main functor. Arguments are then filled in by newparg and newpvararg, 
and the process is terminated by wrapprot. This procedure checks if all
the arguments are ground; when this is the case, the prototype is moved
to the ground prototype area. Note that the process is inherently
recursive, as argument prototypes may be created before their parent term’s
prototype is wrapped up. This is why initprot must be used: piecemeal
allocation would not preserve contiguity.

A short comment is in order about terminst, the procedure usually used to
create terms on the copy stack. Non-variable arguments are represented not by
direct pointers, but by negative values, as if they were all formed by
instantiating pre-existing variables. This is necessary because the
procedure argument (which follows chains of variable bindings to locate the
final instantiation) expects variable arguments directly inside the
representation of their parent terms. A recursive call on terminst can return a
variable, and treating the variable as a normal argument (by inserting a
positive pointer to it) would break the chain of references. (Such things
are not easily seen, and the erroneous situations are rather infrequent:
this bug was the hardest to locate!)

7.3.4. Control 

In Toy, clause bodies are represented as prototypes of lists. The list 
elements are prototypes of calls, and none of them is an integer or a 
variable. While not directly related to the external form of clauses in 
Prolog-10, this representation is very regular and easy to handle. 

The method of representing control state is almost exactly like that
described in Chapter 6. The principal difference is that the variable part of
an activation record is kept on a separate stack. Figure 7.4 is a detailed
version of Fig. 6.6d. We shall comment only on the variables used as
“control registers”.

[FIG. 7.4: diagram not reproduced; surviving labels include
frozenheap = 5000, csbot = 4999 and ancenv = 2.]

The crucial variables are: 

—topf, a pointer to the current control frame (i.e. activation record),
which is always on top of the stack in FT;

—topb, a pointer to the current backtrack point (i.e. fail point) record,
which is always on top of the stack in BT;

—csbot, a pointer to the first free location below the copy stack in MT
(this stack grows downwards);

—vtop, a pointer to the first free location above the variable stack in MT;

—ttop, a pointer to the first free location above the trail stack in TT.

Five auxiliary variables contain copies of information available
elsewhere. They are used for efficiency:

—ancf is a pointer to the current control frame’s parent frame;

—topenv is a pointer to the current variable frame (associated with topf);

—ancenv is a pointer to the variable frame associated with ancf;

—frozenheap is a pointer to the first free location below the frozen part of
the copy stack;

—frozenvars is a pointer to the first free location above the frozen part of
the variable stack.

Execution of a Prolog program is driven by the procedure resolve. 
Each turn of its loop is an attempt to match a call against a clause head, or 
to execute a system procedure. At the beginning of this step the situation 
is as shown in Fig. 7.4: a control frame for the current call is on top of the 
stack, but the clause is not yet invoked and the associated variable frame 
is empty. If the call was an erroneous system procedure call, the error 
handler is activated (see below). 

If the step is unsuccessful (the head did not match the call, or the 
system procedure failed), the interpreter backtracks. Otherwise it enters 
the procedure or—if it was a system procedure or a unit clause—exits it. 
Entering a procedure consists in setting up the control state so that the 
next call to be executed will be the first call in the freshly activated clause. 
Exiting is the process of finding the next pending call: either the one 
immediately following the successful current call, or (if this was the last in 
its clause) a call following the nearest ancestor which is not the last call in 
its clause. 

To stop the execution, the flag stop must be set. This is done either by 
the system procedure halt, or by backtrack when there are no fail points 
left (i.e. when the directive failed) or by exitt when it cannot find a pend¬ 
ing call (i.e. when the directive succeeded). 

Two auxiliary variables play the role of a program counter: 

—ccall contains a pointer to the prototype of the current call (it is the 
prototype of the first element in the list indicated by the current control 
frame’s calls field, unless that element is an invocation of call or tag: 
ccall is then the outermost argument which is neither of these); 

—cproc contains a pointer to the descriptor of the procedure invoked by 
ccall (for Prolog procedures, this is the first clause’s descriptor when in 
forward execution, and a pointer recovered from a backtrack point 
record’s resume field when immediately after a failure). 

Notice that a fail point’s resume field points at the predecessor of the 
clause which is to be retried. This is so to make retract correct. 

The algorithm used for tail recursion optimisation (procedure
trooverlay) merits some explanation. We employ the naive method
suggested by Fig. 6.7. After unification is over, procedure candotro checks
whether the current call is an untagged tail call and whether the ancestor
frame is not frozen. If so, neither the call nor the variable frame
associated with the ancestor frame will ever be needed again. The current
variable frame is shifted to replace the ancestor variable frame, and the
control stack is popped so that the ancestor control frame becomes
topmost (the most recently activated clause is still accessible through cproc).
The algorithm is made a little complicated by the fact that the shifted
variables may be instantiated to one another or to the destroyed (overlaid)
variables. Both cases are illustrated in Fig. 7.5.

The cut procedure simply removes as many backtrack point records
as necessary (possibly none) to ensure that the call invoking the procedure
containing the cut, and all subsequent calls, will not be retried. (There
are exceptions to this rule: notice that ,/2, ;/2 and call/1 are transparent to
the cut.) After popping off backtrack points, the interpreter must purge
the topmost section of the trail to remove references to variables which
are no longer frozen. This is necessary, because such variables can be
popped off, or shifted, during TRO. Notice that the method of TRO applied
here makes it fairly easy to perform delayed frame merging after things
are made deterministic by the cut. We shall not enter into the details of
this and of tagcut: this is a simple exercise.

FIG. 7.5 Tail recursion optimisation: merging two frames. (a) The initial
situation. Neither frame is frozen, and the call is tail recursive. The
variables at 57 and 58 are instantiated to the same free variable, the
variable at 59 is instantiated to the variable at 44. (b) Adjustment pass.
(i) The first variable (at 57) points at an overlaid free variable (at 56).
The direction of the pointer is reversed. (ii) The second variable (at 58)
is dereferenced to that at 57 through that at 56. The reference is remapped:
the second variable points at 55, the future location of the variable now
at 57. (iii) The third variable (at 59) is dereferenced to that at 44
through that at 55. (c) Shifting pass overlays the parent’s variable frame
with the current variable frame; the parent’s control frame becomes current.

The last thing worth mentioning is the handling of erroneous calls to
system procedures. This involves pushing a dummy variable frame, with
a single variable instantiated to the erroneous call. The current control
frame (in which the call was invoked) is associated with this variable
frame and becomes the ancestor of error/1. As a result, the parameter of
error/1 is the right instance of the erroneous call. The process is
illustrated in Fig. 7.6.

The program maintains several important invariants, such as “there 
are no outside references to non-frozen variable frames except from vari¬ 
ables higher in the variable stack”. We decided to let you have the fun of 
discovering them for yourself (after all, these are the real trade secrets). 


7.3.5. System Procedures 

We shall not give a detailed description of the system routines. There 
are too many of them, and the listing is more or less self-explanatory. The 
general principles are as follows: 

—All system procedures are invoked through procedure sysroutcall;

—sysroutcall sets up pointers to their parameters in table SPAR (the
values of integer parameters are also passed through table SPARV);

—System procedures that can fail or succeed indicate the result by setting
a Boolean parameter (success) passed by sysroutcall;

—Whenever a system procedure detects an error, it sets the global flag
syserror, which forces the interpreter to invoke error/1 (see the end of
the previous section).

There are no tricks, except in the procedure concerned with creating 
new clauses. It is important that several occurrences of a variable be 
represented by occurrences of the same offset when a term is translated 
into a prototype. To achieve this, addresses of variables appearing in an 
asserted clause are stacked in the free area above the topmost variable 
frame. With each variable occurrence, this temporary variable dictionary 
is searched linearly and, possibly, augmented. The position of a variable 
in this dictionary is treated as its offset. 
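
The temporary variable dictionary can be sketched in a few lines of Python. The function name and the use of a Python list are assumptions; the linear search and the use of the position as the offset are as described above.

```python
def variable_offset(var_address, var_dict):
    """Return the offset of a variable, appending it to the dictionary if new."""
    for position, known in enumerate(var_dict):   # linear search
        if known == var_address:
            return position
    var_dict.append(var_address)
    return len(var_dict) - 1


# Three occurrences, two distinct variables (the addresses are illustrative).
var_dict = []
offsets = [variable_offset(a, var_dict) for a in (57, 60, 57)]
```

Both occurrences of the variable at address 57 receive the same offset, which is exactly the property required of the prototype.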

To add a new system procedure, one must:

—Write its code;

—Insert its identifier in type sysroutid (its place there defines its position);

—Insert its call in procedure sysroutcall (in the same position);

—Insert its name and arity in the kernel file (in the same position); see
the next section.

FIG. 7.6 Handling erroneous calls to system routines. (a) weft detects an
incorrect argument: a(V). (b) A call to error/1 is set up.


7.3.6. Initialisation

Initialisation is done in three phases. First, most of the variables are
set by procedure initvars. Then two portions of data are read from the
so-called “kernel file”. One portion defines the names and arities of standard
atoms whose addresses must be known to the interpreter. They are
created and their addresses are stored in table STD. The other portion
defines the names and arities of system procedures: as the atoms are
created, they are associated with system procedure descriptors. The number
and order of all these atoms is known to the interpreter. Arities are
important, but print names are arbitrary and can be changed at will.

The last phase of initialisation consists in creating a number of stand¬ 
ard objects. Their addresses are known to the interpreter but they cannot 
be created before the addresses of standard atoms are fixed. The objects 
are: 

—The prototype of [];

—The prototypes of characters;

—The prototypes of the integers -1, 0, ..., 10;

—The dummy clause body used to invoke error (it is the prototype of 
[error(X)]); 

—The prototype of user, needed by the stream switching procedures (see 
Section 5.7.1). 

After initialisation, the interpreter begins normal execution, reading 
the current file. This is normally the kernel file, containing some useful 
library procedures. One can also append the bootstrapper or the trans¬ 
lated monitor (see below). 

7.4. INTERPRETATION OF PROLOG-10
IN TOY-PROLOG


7.4.1. Intermediate Language 

Even a modest program in Toy-Prolog can be unmanageable. To write 
the monitor, we use a subset of full Prolog, without operators and gram¬ 
mar rule notation. Commas and the symbol are treated as separators. 

List notation is allowed, with one restriction: an X in [... | X] must be a
variable. This subset is translated into Toy-Prolog by a “bootstrapper”
written in Toy-Prolog. Debugging and testing the monitor required
frequent retranslations of its small pieces, but the gain in readability was worth
this extra effort. Of course, once the monitor works, the bootstrapper is
no longer needed.

The bootstrapper is listed in Appendix A.2. Comments starting with 
%% associate mnemonics with variable numbers. The main procedure is 
translate (lines 2-13), with two parameters—the names of the source and 
output files. The unit processed with each turn of the failure-driven loop is 
a single clause or a comment. The loop stops upon encountering a @ in 
place of the first non-blank character of a unit. The translation of a clause 
is a string which is built “on the fly” on a difference list of characters; the
list is represented by the two parameters christened termrepr and
rest_of_termrepr. Here is how the clause in lines 54-55 would look after
rewriting it into full Prolog and combining those parameters:

    ctailaux( Fterm_firstch, Termrepr -- Rest_of_termrepr,
              Sym_tab ) :-
        fterm( Fterm_firstch, Fterms_firstch,
               Termrepr -- [ '.' | Middletermrepr ], Sym_tab ),
        fterms( Fterms_firstch, Middletermrepr -- Rest_of_termrepr,
                Sym_tab ).

Fterms_firstch is the first non-blank following a functor-term; in a correct
clause, it can only be a dot or a comma (see lines 58-66).

Comments embedded in a clause are copied at once (lines 50-53). 
Moreover, the string contains end-of-line and blank characters which 
improve the appearance of the translation. 

Error in a clause causes a message to be printed and the input up to 
the nearest dot to be reprinted and skipped (see lines 15-21). The program 
assumes the data are correct, and protests upon encountering the first 
unexpected character. 

Output for each clause with variables is followed by a comment that
associates variable numbers with source names taken from a symbol table
for this clause (lines 219-226). The table is an open list of names. Their
positions are used as variable numbers in the translation. Up to 99
variables can occur in a clause. The number-name pairs are written six in a
line (line 224).

There are some other minor points worth noticing. For example, the
output string gets closed eventually due to the [] in the initial call on
clause (line 11); translations of lists within lists are parenthesized, see the
fifth parameter of term (lines 131, 136, 137); identifiers are enclosed in
quotes by fterm (lines 69-70); etc. However, the rest of the program
should be self-explanatory. A hint: it can be viewed as a metamorphosis
grammar used for synthesis, driven by input data, with the two
components of a difference list serving as an input and output parameter (see
Section 3.1).


7.4.2. Overview of the Monitor

The core of the monitor is an implementation of the built-in procedure
read that is used in user programs (see Section 7.4.3). The user
communicates with Prolog via an interactive “driver” which operates in a
loop terminated by executing the procedure stop. In each cycle the driver
prompts the user with

?. 

and then reads and executes a directive. The symbol table (returned by 
the two-parameter read; see the end of the next section) pairs source 
names of variables with variables proper. After successful execution, 
the symbol table is used to display final instances of these variables, and 
the driver awaits a printable character. If it is a semicolon, execution 
resumes with forced failure, else processing of this directive terminates. 

A directive can be prefixed with :- (we call such a directive a
command, and that without the prefix a query). It will then be executed
deterministically, and variable instances will not be printed. However, neither
a non-unit clause nor a grammar rule makes sense when read directly by
the driver: a two-parameter procedure :- or --> (presumably undefined)
would be called. User procedures can be defined by calling the built-in
procedure consult or reconsult; both are implemented in the driver. In
“consult mode”, a term L --> R is treated as a grammar rule and translated
by the procedure transl_rule (see Section 7.4.4). A one-parameter term
:- C is treated as a command and executed. Other terms are treated as
program clauses.

The monitor is listed in Appendix A.3.


7.4.3. Reader 

The syntax of Prolog-10 is only deceptively simple, so the reader is 
rather involved. One wonders whether a simpler syntax would necessar¬ 
ily be less user-friendly. 

The main component of the reader is a parser which produces internal
representations of terms on input (Appendix A.3, lines 90-332).
Translation of an internal representation into a term proper is quite
straightforward (look at the listing of the procedure maketerm, lines 334-357, after
reaching the end of this section).

The parser is a classic operator precedence parser; those parsers
belong to the “shift-reduce” class: they are bottom-up and
deterministic (Gries 1971, Aho and Ullman 1977).

Recall that, roughly, an operator precedence grammar has no
production with two consecutive non-terminals, and all its productions are such
that a shift-reduce parser can determine the handle by comparing
neighbouring terminals in a sentential form. This is possible when each
pair of terminals is in at most one of the three relations denoted by <, =,
>. The relations are defined as follows (p, q are terminals, U, V, W
non-terminals):

—p = q if there exists a production of the form

        U → ... p q ...
    or  U → ... p V q ...

—p < q if there exists a production of the form

        U → ... p V ...

    where q ... or W q ... can be derived from V;

—p > q if there exists a production of the form

        U → ... V q ...

    where ... p or ... p W can be derived from V.

A parser shifts (i.e. scans a sentential form from left to right) until it 
detects a pair of terminals related by >. It then scans backwards until the 
nearest pair of terminals related by <. The < and > are assumed to be 
brackets delimiting the handle in a canonic parse: the handle is reduced 
and the process continues. 

Note that <, = and > have nothing to do with the common
number-ordering relations. However, if terminals are operators as in arithmetic
expressions, these relations reflect operator priority: the grammar is
structured so that higher priority operators (with operands) are reduced
first. The situation is similar in the case of Prolog “operators” (even
though in Prolog-10 weaker operators are given the higher priority). We
shall say, very informally, that f is weaker than g if f < g or g > f. But
note that, for example, + < (, ( = ), and + > ).

We shall now return to our program. We assume that the input is 
delimited by two additional operators. The rightmost delimiter is weaker 
than any operator to its left; the leftmost is weaker than any operator to its 
right (except the other delimiter). Notice that an empty input is erro¬ 
neous. 

The parser maintains a stack of symbols. Initially the stack contains
only the leftmost delimiter. The first true terminal becomes the current
input terminal. In each step, the current input terminal is compared to the
topmost terminal on the stack. Three situations are possible:

1. The input is erroneous—the parser stops “with error”; 

2. The topmost terminal is stronger—there must be a production with the 
righthand side consisting of a number of topmost symbols on the stack; 
we reduce the stack by replacing all these symbols with a correspond¬ 
ing lefthand side; 

3. The topmost terminal is not stronger, i.e. no righthand side has been 
completed—we shift the current input terminal onto the stack and 
make the next terminal current. 
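
The three cases can be seen at work in a small operator precedence parser. The sketch below is in Python and deals with a toy grammar of arithmetic expressions, not with the grammar of Prolog-10 terms; the relation table, the token representation and all function names are assumptions made for the illustration. It does follow the scheme just described: the stack starts with bottom, the input ends with dot, and each step compares the topmost stack terminal with the current input terminal.

```python
# Toy grammar (an assumption of this sketch): t -> t + t | t * t | ( t ) | id
OPS = {"+", "*"}
FIXED = OPS | {"(", ")", "bottom", "dot"}
REL = {}  # (class of top terminal, class of input terminal) -> "<", "=" or ">"


def _rel(tops, inputs, relation):
    for top in tops:
        for inp in inputs:
            REL[(top, inp)] = relation


_rel(["bottom", "(", "+", "*"], ["id", "("], "<")  # an operand opens a new handle
_rel(["bottom", "("], ["+", "*"], "<")
_rel(["id", ")"], ["+", "*", ")", "dot"], ">")     # a complete operand is reduced
_rel(["+", "*"], [")", "dot"], ">")
_rel(["("], [")"], "=")
_rel(["+"], ["+"], ">")                            # + is left-associative
_rel(["+"], ["*"], "<")                            # * is stronger than +
_rel(["*"], ["+", "*"], ">")


def cls(token):
    """Terminal class of a token: itself if fixed, otherwise an identifier."""
    return token if token in FIXED else "id"


def is_t(item):
    return isinstance(item, tuple)   # nonterminals are ("t", tree) pairs


def top_terminal(stack):
    return next(item for item in reversed(stack) if not is_t(item))


def reduce(stack):
    """Replace a right-hand side on top of the stack by the nonterminal t."""
    if not is_t(stack[-1]) and cls(stack[-1]) == "id":
        stack[-1:] = [("t", stack[-1])]
    elif (len(stack) >= 3 and is_t(stack[-3]) and is_t(stack[-1])
          and not is_t(stack[-2]) and stack[-2] in OPS):
        stack[-3:] = [("t", (stack[-2], stack[-3][1], stack[-1][1]))]
    elif len(stack) >= 3 and stack[-3] == "(" and is_t(stack[-2]) and stack[-1] == ")":
        stack[-3:] = [stack[-2]]
    else:
        raise SyntaxError("no production matches the top of the stack")


def parse(tokens):
    stack, inputs, pos = ["bottom"], list(tokens) + ["dot"], 0
    while True:
        top, current = top_terminal(stack), inputs[pos]
        if top == "bottom" and current == "dot":
            if len(stack) == 2:
                return stack[1][1]            # bottom t(...) dot: success
            raise SyntaxError("empty input")
        relation = REL.get((cls(top), cls(current)))
        if relation is None:                  # case 1: erroneous input
            raise SyntaxError(f"{top!r} cannot be followed by {current!r}")
        if relation == ">":                   # case 2: top is stronger, reduce
            reduce(stack)
        else:                                 # case 3: shift
            stack.append(current)
            pos += 1
```

For the input a + b * c the parser shifts past + because + < *, so that b * c is reduced first.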


Our operator grammar of Prolog-10 terms assumes seven classes of 
terminals and one class of nonterminals, t (for terms). Parameters of 
symbols are used to build the internal representation of a given term. 

Terminal symbols are read by a scanner (see Appendix A.3, lines
361-480). The procedure absorbtoken (lines 379-409) reads and
constructs a “raw” token:

—id(NameString)         from words, symbols, and solo-characters;
—qid(NameString)        from quoted names;
—var(NameString)        from variables;
—num(NumberString)      from integers;
—str(String)            from strings;
—br(LeftRight, Type)    from brackets (LeftRight is l or r,
                        Type is ’()’, [], or ’{}’);
—bar                    from |;
—dot                    from a full stop.


Next, the procedure make_token (lines 457-480) constructs a terminal
symbol:

—vns(Variable)               from var(NameString);
—vns(Number)                 from num(NumberString);
—vns(String)                 from str(String);
—ff(Name, Types, Priority)   from id(NameString) (when this functor
                             is an operator);
—id(Name)                    from id(NameString) (when this functor
                             is not an operator) and from
                             qid(NameString) (i.e. a quoted name never
                             denotes an operator);
—br(LR, T)                   from br(LR, T);
—bar                         from bar;
—dot                         from dot.

The terminal symbol dot is used as the rightmost delimiter of the input.
The leftmost delimiter (and the seventh terminal) is bottom. It is never
returned by the scanner: the parser’s main procedure, gettr, pushes it
onto the initially empty stack. Neither delimiter appears in
productions.
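
The mapping from raw tokens to terminal symbols can be sketched in Python as below. Tokens are modelled as tuples, and the operator table, its two entries and the treatment of variables are assumptions of the sketch (the real reader relates variable names to Prolog variables through a symbol table).

```python
# Illustrative operator table: name -> (types, priority). The entries follow
# common Prolog-10 declarations but are assumptions of this sketch.
OPERATORS = {
    "-": (["yfx", "fx"], 500),
    ":-": (["xfx", "fx"], 1200),
}


def make_token(raw, operators=OPERATORS):
    """Turn a raw scanner token (a tuple) into a terminal symbol."""
    kind = raw[0]
    if kind == "var":
        return ("vns", ("variable", raw[1]))   # a real reader uses the symbol table
    if kind == "num":
        return ("vns", int(raw[1]))
    if kind == "str":
        return ("vns", list(raw[1]))           # a string as a list of characters
    if kind == "id" and raw[1] in operators:
        types, priority = operators[raw[1]]
        return ("ff", raw[1], types, priority)
    if kind in ("id", "qid"):
        return ("id", raw[1])                  # quoted names never denote operators
    return raw                                 # br(LR, T), bar and dot pass through
```

Note that the same name yields ff when it arrives as id and plain id when it arrives quoted.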

The Types argument of ff is a list of functor types: [Binary], or
[Unary], or [Binary, Unary] (see the definition of the built-in procedure
op (lines 656-718)).

A symbol table in an open list is used to relate a variable’s name to a 
Prolog variable. 

The grammar underlying the parser is given in the listing (lines
99-107). The definition of internal representation can be read off the reduce
procedure (lines 158-179). Incidentally, the procedure can be
paraphrased as a metamorphosis grammar. For example, the fifth and sixth clause
would be rewritten as

    t( tr( Type, X )) --> [ br( l, Type ) ], t( X ),
                          [ br( r, Type ) ].

    t( bar( X, Y )) --> [ br( l, [] ) ], t( X ), [ bar ],
                        t( Y ), [ br( r, [] ) ].


Notice, however, that top-down analysis based on such a grammar would 
not be deterministic. 

There are five types of internal representations:

—arg0(X)           for X a variable, name, string, or nullary functor;

—tr1(Name, X)      for a prefix or postfix term (X is the representation
                   of the argument);

—tr2(Name, X, Y)   for an infix term (X, Y are the representations of
                   the arguments). In particular, the comma is an infix
                   functor, so “comma-lists” of terms are represented
                   with tr2. For example, the representation of

                       a, b, c

                   is

                       tr2( arg0( a ),
                            tr2( arg0( b ), arg0( c )));

—bar(X, Y)         for a list with front X and tail Y; X is often the
                   representation of a comma-list;

—tr(Name, X)       for all other valid situations:

                   tr(’()’, X) is equivalent to X;

                   tr([], X) represents a list (of definite length), X
                   usually represents a comma-term;

                   tr(’{}’, X) represents the term {(Cond)} where
                   Cond is the term represented by X (this is
                   used in grammar rules);

                   tr(Name, X) with Name other than a bracket
                   type (and X usually the representation of a
                   comma-term) represents a normal term; for
                   example, the term

                       foo( ”p”, 5 )

                   is represented by

                       tr( foo, tr2( arg0( [ p ] ), arg0( 5 )))

The parser’s entry point is the procedure gettr (lines 125-127), and
the main loop is implemented as the procedure parse (lines 129-138). The
loop terminates successfully when the original input (bottom and dot
included) reduces to the sequence

    bottom t( InternalRepresentation ) dot

The parser fails in two situations:

—when the procedure establish_precedence fails, i.e. when the topmost
terminal on the stack and the current input terminal do not compare;

—when the procedure reduce fails, i.e. the top segment of the stack does
not match any production.

The procedure topterminal (lines 140-143) returns Top, the topmost 
stack terminal, and its position: 1 means Top is the top item, 2 means it is 
covered by a t(_). 

The precedence relations are summarized in Table 7.1. We treat all
operators jointly with respect to other terminals. Empty slots signify
erroneous combinations of contiguous terminals.

TABLE 7.1

Precedence Relations for the Operator Grammar of Terms

                                     Input
Top      vns   id    (     )     [     |     ]     {     }     ff       bottom  dot
vns                        >           >     >           >     >b               >
id                   =     >           >     >           >     >b               >
(        <     <     <     =     <                 <           <                >
)                          >           >     >           >     >b               >
[        <     <     <           <     =     =     <           <                >
|        <     <     <           <           =     <           <                >
]                          >           >     >           >     >b               >
{        <     <     <           <                 <     =     <                >
}                          >           >     >           >     >b               >
ff       <a    <a    <a    >     <a    >     >     <a    >     < or >           >
bottom   <     <     <     <     <     <     <     <     <     <        <       >
dot

a Top can be any prefix or infix functor, i.e. Types = [xf] and Types = [yf] are
excluded.

b Input can be any infix or postfix functor, i.e. Types = [fx] and Types = [fy] are
excluded.

A functor-functor relationship is the only potentially conflicting
one: to establish the precedence relation for a given Top and Input, we
must consider their priorities and types (sometimes an even broader
context should be considered, but this might require changes in the
otherwise deterministic algorithm). If the priorities differ, the functor with
lower priority is taken as stronger, according to the conventions of
Prolog-10. (Notice, however, that when Top is stronger, Input cannot be a
prefix functor, and when Input is stronger, Top cannot be postfix.) If Top
and Input have equal priorities, their types must be examined (see below).

Mixed¹ functors require special treatment. In most contexts, their
inherent ambiguity is only apparent: only one of a functor’s types can be
properly attributed to it. For example, let Input be &, an [xfy, fy] functor,
and Top a left parenthesis not covered by a non-terminal:

... ( & ...

Surely, & can only be a prefix variation of this mixed functor; an infix
variation is excluded. Likewise, if Top is $, an [xfx, xf] functor, covered
by a non-terminal, and Input a right bracket:

... $ Term ] ...

then $ certainly cannot be a postfix functor. In such situations, we can
“disambiguate” the mixed functor by removing the incompatible type
from its representation. For example, we replace ff('&', [xfy, fy], Priority)
with ff('&', [fy], Priority).

¹ Recall that our version of Prolog allows a mixed functor to have only one binary and
one unary type, both with the same priority.

The relation in Table 7.1 is implemented by the procedure
establish_precedence (lines 195-204), which takes the two terminals and the
position of Top. It fails given an incorrect combination; otherwise it
succeeds with the fourth parameter instantiated as gt (Top is stronger) or lseq
(Top is not stronger). When both terminals are mixed functors, the
procedure tries to disambiguate their types. The last two parameters are
instantiated as the new top and new input terminal, to be used in the next step
(usually they remain unchanged).

The real job is done by the procedure p which returns gt or lseq, or,
when functors are involved, gt(NewTop, NewInput) or lseq(NewTop,
NewInput). It fails given an erroneous pair of terminals.

Table 7.1 has 80-odd nonempty entries, but it can be easily simplified.
First of all, we can treat bottom and dot separately; see the last two
clauses of p (lines 240-241). Next, we consider slots with “=”: the first
four clauses (lines 206-209) take care of this, and the remainder of p can
operate with the six slots cleared. Now we are left with a 10 x 10 table
with three different rows and three columns. Table 7.2 depicts the
situation after combining identical rows and columns.

TABLE 7.2

Simplified Precedence Relations

a Top and Input cannot be separated by a non-terminal.
b Input cannot be a prefix functor.
c Top cannot be a postfix functor.

The next six clauses of p (lines 211-222) take care of the six
nonconflicting slots in Table 7.2. The procedure restrict (lines 265-271) is used to
test and possibly disambiguate the type of a functor. The procedure
performs set subtraction for sets given as lists; it will fail if the difference is
an empty set.
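
The set subtraction performed by restrict can be sketched as follows. This is only an illustration in standard Prolog, not the code of Appendix A.3: the three-parameter interface, the order of parameters and the helper names subtract_types and in_list are assumptions made for the example.

```prolog
% restrict( +Types, +Excluded, -Remaining )    (hypothetical interface)
% Remaining is Types with every member of Excluded removed;
% the call fails when nothing remains.
restrict( Types, Excluded, Remaining ) :-
    subtract_types( Types, Excluded, Remaining ),
    Remaining \== [].

subtract_types( [], _, [] ).
subtract_types( [T | Ts], Excluded, Rest ) :-
    in_list( T, Excluded ), !,
    subtract_types( Ts, Excluded, Rest ).
subtract_types( [T | Ts], Excluded, [T | Rest] ) :-
    subtract_types( Ts, Excluded, Rest ).

% Deterministic membership test on ground type names.
in_list( X, [X | _] ) :- !.
in_list( X, [_ | Xs] ) :- in_list( X, Xs ).
```

With these definitions, restrict( [xfy, fy], [xfy, xfx, yfx], R ) leaves only the prefix type in R, whereas restrict( [fy], [fy, fx], R ) fails because no type survives.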

Now we must try to resolve a conflict in the remaining slot. A closer
look at the grammar allows a refinement of this slot (see Table 7.3). The
12th and 13th clauses of p (lines 229-238) are responsible for situations
when the priorities differ. Again, we also attempt a disambiguation of
types.

The 11th clause (lines 225-227) applies to functors with equal
priorities. Table 7.4 shows the precedence relation in this case. We allow all
combinations that can be disambiguated without analysing broader
context to the left or to the right of the two functors. For example, an xfy
functor f is weaker than an xfx functor g because the term

A f B g C

cannot be interpreted as

(A f B) g C

since g’s left argument would then have, incorrectly, the same priority as g.
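
The same rule can be observed in any Prolog whose reader follows the Prolog-10 operator conventions. The sketch below is written in standard Prolog rather than Toy-Prolog; the operator names f and g, the priority 700 and the predicate names example and grouped_under_f are chosen only for illustration.

```prolog
:- op( 700, xfy, f ).     % f accepts an argument of equal priority on its right
:- op( 700, xfx, g ).     % g accepts no argument of equal priority

% With these declarations the only admissible reading of  a f b g c
% is  f( a, g( b, c ) ):  the reading  g( f( a, b ), c )  would give g
% a left argument of g's own priority.
example( a f b g c ).

grouped_under_f( A, B, C ) :- example( f( A, g( B, C ) ) ).
```

The stored fact has f as its principal functor, so grouped_under_f( a, b, c ) succeeds.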

The relation of Table 7.4 is implemented by the procedure ff_p (lines
319-332), which returns lseq, gt or err. Conflict resolution is performed by
the procedure res_confl (lines 273-291), which also returns lseq, gt or err
(err is later rejected by do_rels called in p). It also returns the
disambiguated (sometimes unchanged) functors.

TABLE 7.3

A Refinement for Two Operators

                    Input
Top          prefix    infix     postfix
prefix       <         < or >    < or >
infix        <         < or >    < or >
postfix                >         >

TABLE 7.4

Precedence Relations for Operators with Equal Priorities

                         Input’s type
Top’s type   xfy    xfx    xf     yfx    yf     fy     fx
yfx                               >a     >a
xfx                               >a
fx                                >a     >a
xfy          <a     <a     <a                   <b     <b
fy                  <a     <a                   <b     <b
yf                                >b     >b
xf                                >b     >b

a Top and Input must be separated by a non-terminal.
b Top and Input must not be separated by a non-terminal.

If only one of the terminals is a mixed functor, we choose a
nonconflicting interpretation by comparing slots in Table 7.4. This is done by
the procedure match_rels (lines 297-300). For two mixed functors we
extract a subtable of relations for each possible pair of interpretations; see
Table 7.5 (and lines 286-289). The situation is clear if all four slots are the
same. Otherwise there are only four patterns which can be correct: when
one of the rows or one of the columns contains two err slots. For details,
see the procedure res_mixed (lines 302-317).

TABLE 7.5

The Subtable Template for Two Mixed Functors:
the Binary and Unary Types Are Compared with
Each Other

             TInpBin    TInpUn
TTopBin      RelBB      RelBU
TTopUn       RelUB      RelUU

The procedure read(Term, SymbolTable) performs the two phases of
the reader (see lines 65-69). It returns the symbol table with variables
from this term. The table is used by the interactive driver (see Section
7.4.2). If data are incorrect, the parser will stop on the first bad symbol
and read/1 will skip characters up to the nearest full stop after this symbol
(which may also be a full stop). The built-in procedure read/1 simply
encapsulates read/2.

7.4.4. Grammar Preprocessor 

The grammar rule preprocessor (lines 482-583 in Appendix A.3) operates
according to the principles presented in Chapter 3. The list of lefthand side
terminals (usually empty) is connected to the output variable of the lefthand
side non-terminal. Calls on the procedure terminal (Section 3.1) are
“pre-executed” for efficiency. By way of explanation, here are two examples. The
rule

a --> [ p ], b, [ q, r ], c.

is translated into

a( [ p | X ], Z ) :- b( X, [ q, r | Y ] ), c( Y, Z ).

The rule

a --> b, [ q, r ], c, [ s ].

is translated into

a( X, Z ) :- b( X, [ q, r | Y ] ), c( Y, [ s | Z ] ).

(The translation of a list of terminals is true, absorbed by the next item’s
translation; see combine, lines 540-542.)
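
To see that clauses of this shape behave like the rules they come from, they can be run directly. In the sketch below the two translations are renamed a1 and a2 so that both can be loaded together, and b and c are hypothetical non-terminals that consume a single x and a single y; none of these definitions comes from Appendix A.3.

```prolog
% Hand translation of   a --> [ p ], b, [ q, r ], c.
a1( [p | X], Z ) :- b( X, [q, r | Y] ), c( Y, Z ).

% Hand translation of   a --> b, [ q, r ], c, [ s ].
a2( X, Z ) :- b( X, [q, r | Y] ), c( Y, [s | Z] ).

% Hypothetical non-terminals, for illustration only:
% each removes one symbol from the front of the input list.
b( [x | S], S ).
c( [y | S], S ).
```

Both a1( [p, x, q, r, y], [] ) and a2( [x, q, r, y, s], [] ) succeed: the terminals q and r are matched inside the call on b, through its output parameter, without any separate call on terminal.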

Conditions/actions (other than a single cut) are passed to the
preprocessor as '{}'(C); see the procedure maketerm in the reader, lines
345-346. The functor '{}' is stripped off by the procedure transl_item, line
550.

Righthand sides separated by semicolons are preceded by a
non-terminal defined as

' dummy' --> [].

This is necessary when alternatives start with different terminals. For
example, the rules

a --> [ p ], b.     and     a --> [ q ], c.

would be translated with

a( [ p | Y ], Z )     and     a( [ q | Y ], Z )

as a lefthand side. Consequently the rule

a --> [ p ], b ; [ q ], c.

must be translated as

a( X, Z ) :- ' dummy'( X, [ p | Y ] ), b( Y, Z ) ;
             ' dummy'( X, [ q | V ] ), c( V, Z ).

For simplicity, this has been applied to all rules with alternatives.
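
A runnable sketch of this translation follows. The clause for ' dummy' is what the rule with an empty righthand side amounts to under the translation scheme described above; b and c are again hypothetical non-terminals introduced only for the example.

```prolog
% ' dummy' --> [].   consumes nothing: input and output coincide.
' dummy'( X, X ).

% a --> [ p ], b ; [ q ], c.
a( X, Z ) :- ' dummy'( X, [p | Y] ), b( Y, Z )
           ; ' dummy'( X, [q | V] ), c( V, Z ).

% Hypothetical non-terminals, for illustration only.
b( [x | S], S ).
c( [y | S], S ).
```

Because the head of a leaves X free, either alternative can inspect the first terminal: a( [p, x], [] ) succeeds through the first branch and a( [q, y], [] ) through the second.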


7.4.5. Library 

The library (Appendix A.3, lines 585-1002) contains definitions of about
20 built-in procedures (note that several simple procedures are also defined in
the kernel file, Appendix A.1). Their definitions in Chapter 5 can
be treated as design documentation. Their implementation is largely
straightforward. We shall comment on a few not quite obvious passages.

The procedures clause(Head, Body) and retract(Clause) are
“backtrackable”, i.e. can be used in failure-driven loops that generate or
remove all matching clauses. Here is a description of the generator (the
other procedure is programmed similarly). We are going to visit all
clauses of a procedure and suspend execution each time we get to a
matching one. This is achieved by setting up a recursive loop with its step
distributed between two clauses (see the procedure remcls/7, lines
814-822). The first clause does the matching. Upon mismatch, we immediately
proceed with the second clause, i.e. conclude the step. If the matching
succeeds, the generator succeeds, too, but with a pending alternative. A
failure later on resumes the second half of the step.
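
The two-clause step can be illustrated on an ordinary list. The sketch below does not reproduce the library procedure; it only shows the control pattern, with the clauses of a procedure replaced by list items and with a hypothetical predicate name, visit.

```prolog
% visit( +Items, ?Pattern ): enumerate, on backtracking, the items
% that unify with Pattern.
visit( [Item | _], Pattern ) :-      % first half of the step: try to match
    Item = Pattern.
visit( [_ | Rest], Pattern ) :-      % second half: move on to the next item
    visit( Rest, Pattern ).
```

In a failure-driven loop such as visit( [f(1), g(2), f(3)], f( X ) ), write( X ), nl, fail the generator is resumed after every failure, so the loop writes 1 and then 3 before it finally fails.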

The procedures write and writeq both encapsulate the procedure
outterm(Term, With_or_without_quotes) which first uses numbervars (lines
623-632) to bind all variables in Term, and next calls

outt( Term_after_numbervars, Context, With_or_without_quotes ).

Context specifies the essential features of a functor whose argument is
Term. If it is not an operator, or there is no external functor, then Context
is fd(_, _). Otherwise, Context is fd(ff(Priority, Associativity), Dir). Term
may be to the left (Dir = l) or to the right (Dir = r) of the functor.
Associativity may be a(l) or a(r) for left- and right-associative functors, and na(l),
na(r), or na(_) for non-associative functors. Context is tested by the
procedure out_ff/5 (lines 933-935) to decide whether Term should be
parenthesized to avoid ambiguity in the case of equal priorities. Actually, the
test, performed by agree (lines 939-943), is rather crude (see the
previous section!): sometimes we overparenthesize. The parameter of na has
only been added for homogeneity, but it could be used in a more subtle
detection of non-ambiguous cases.

7.4.6. Translator 

The translator of Prolog-10 into Toy-Prolog (Appendix A.3, lines 1004- 
1088) is invoked by the call 

translate( SourceFileName, OutputFileName). 

Commands are translated and also executed (deterministically), so that, 
for example, a declaration of an infix functor affects subsequent parts of 
the input program. The translator terminates (and succeeds) after reading 
in the unary clause 

end. 

The program is quite easy to understand. Only the procedure lookup
may require an explanation. The table pairs variables of the clause with
consecutive integers, starting from 0. A variable is a key, so we must use
the built-in procedure eqvar to locate variables already present in the
table. The third parameter of the procedure lookup indicates the last
number encountered (initially, -1), so that only a new variable requires
one addition. A more simple-minded solution would be to keep only
variables in the table, and count them during lookup. This would require at
least (n - 1) * n/2 additions for a clause with n variables. (In Toy, integers
are implemented in a particularly simple way, so this might fill the copy
stack with many dead integers.) Another possibility is to apply the procedure
numbervars (inside put) to Head and Body jointly.
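
The idea of carrying the last number along during the search can be sketched as follows. This is not the lookup of Appendix A.3: the five-parameter interface with explicit input and output tables, the Var-Number pair notation, and the use of the standard == in place of Toy-Prolog’s eqvar are all assumptions made for the example.

```prolog
% lookup( +Var, +Table0, +Last, -N, -Table )    (hypothetical interface)
% Table0 is a list of Variable-Number pairs in ascending order of Number.
% Last is the last number encountered so far; the initial call passes -1.
lookup( Var, [], Last, N, [Var-N] ) :-
    N is Last + 1.                       % new variable: the only addition
lookup( Var, [V-K | Rest], _, K, [V-K | Rest] ) :-
    Var == V, !.                         % already present: no arithmetic
lookup( Var, [V-K | Rest], _, N, [V-K | Rest1] ) :-
    lookup( Var, Rest, K, N, Rest1 ).    % K becomes the last number seen
```

Starting from an empty table, a first variable is paired with 0 and a second with 1; looking the first one up again returns 0 and leaves the table unchanged. An addition is performed only in the first clause, i.e. once per distinct variable.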

The translator outputs bare translations. It would be helpful to have 
source comments transferred to the translation, and to get source variable 
names paired with numbers (see Section 7.4.1). Try this exercise for 
yourself. 







8 TWO CASE STUDIES 


8.1. PLANNING 

We shall consider planning with respect to a finite, usually small, set
of objects to which simple actions from a finite, and also small, set are
applicable. Objects constitute a closed “world”. The state of the “world”
is, by definition, the set of all relationships that hold between its objects;
we also call these relationships facts about objects. As a result of an
action, some relationships cease or begin to hold; we say that an action
deletes or adds facts. A fact established by an action is also called a goal
achieved by this action. Every action transforms one state into another.
Planning consists in finding a sequence of actions that lead from a given
initial state to a given final state.

As an example, we shall describe one of the so-called cube worlds.
There are three cubes, a, b, c, and floor. All we can do with them is stack
cubes on cubes or on the floor. There are two types of facts concerning a
cube U and an object W: U is sitting on W, and U is clear (this means that
nothing is sitting on U). The set of possible states is determined by naming
all meaningless (i.e. impossible or forbidden) combinations of facts:

—A cube X sitting on a clear cube Y; 

—A cube sitting on two different objects; 

—Two different cubes sitting on the same cube; 

—An object sitting on itself. 

There is one kind of action: move a single clear block, either from
another block onto the floor, or from an object onto another clear block
(the object must differ from both blocks). As a result of moving X from Y
onto Z, X is sitting on Z instead of Y, Y is clear (unless it is the floor), Z is
not clear (unless it is the floor).

FIG. 8.1 (a) An initial state of the cubes world. (b) A final state of the cubes world.


Even in this microscopic world, planning may require some
sophistication. It is reasonable to postulate that a desirable fact, once added, will
never be deleted (otherwise we risk an infinite loop). However, let the
initial state be that of Fig. 8.1a, described by a conjunction of five facts:

a on floor, b on floor, c on a, clear(b), clear(c).

Let the final state be that of Fig. 8.1b, described by a conjunction of two
goals: c on a, a on b. The first goal is trivially achieved. To put a on b,
though, we must remove c from a, i.e. destroy an already achieved goal.
The simple strategy of achieving goals one by one (and freezing all
relevant facts) would not work in this case.

In a more crowded “world”, a state might comprise so many facts
that its direct representation (as a list, say) would be impractical.
Moreover, even a small change might require copying large data structures.
Clausal representation is free from this disadvantage but it is unwieldy
when a change must be undone, and of course planning is a
trial-and-error process. What we need is a method of incrementally describing
incremental changes, and making them easily undoable.

A state and an action determine the next state, if we assume that the 
action does not affect facts not mentioned explicitly in the description of 
the action’s effects as added or deleted. Given an initial state and a plan, 
i.e. a sequence of actions, we can check whether a fact holds in the 
resulting final state. To undo an action, we remove it from the plan (in 
practice, this may be slightly more complicated). 

For any particular planning problem, the initial state can be
considered fixed. The final state should be given implicitly, as a conjunction of
facts to be established by a plan we are going to find. This approach was
taken by D. H. D. Warren in his remarkable planning program, WARPLAN.

In WARPLAN, a world description is separated from the planning
procedure (see Listing 8.1, pp. 221-223, lines 1-26, for the description of our
cube world). Objects are given implicitly, in descriptions of actions and facts.
Actions are defined by three procedures. The two-parameter procedure

can( Action, Precondition )

serves as a catalogue, with one clause per action; Precondition is a
conjunction of facts that must hold for Action to be applicable. A conjunction is
either a fact, or a pair of conjunctions constructed by the infix functor &,
e.g. c on a & a on b.

Two other procedures, 

add( Fact, Action ) 
del( Fact, Action ) 

give facts added and deleted by available actions (and, conversely,
actions which can add or delete a fact). Impossible combinations of facts
are listed in the procedure

imposs( Conjunction )

In these four procedures, we can use variables instead of world objects to
express general laws, e.g. “a clear cube U is sitting on a cube V”:

U on V & notequal( V, floor ) & clear( U )

For efficiency, facts that hold in the initial state, and are unaffected by
any action, are listed in the procedure

always( Fact )

Other facts that hold in the initial state are supplied by the procedure

given( InitialStateName, Fact )

The initial state is denoted by its name, e.g. start. A state derived
from it by actions A1, ..., An is denoted by the term

InitialStateName : A1 : ... : An,

e.g.

start : move( c, a, floor ) : move( a, floor, b ) : move( c, floor, a )
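
Given such a term, whether a ground fact holds in the state it denotes can be checked directly from add, del and given. The sketch below is a simplified illustration for ground facts and ground states only; it is not the procedure holds of Listing 8.2, which also copes with variables. Two notational liberties are assumptions of the example: states are built with a hypothetical yfx operator then, so as not to disturb the meaning that ':' has in present-day Prolog systems, and facts are written on( U, W ) rather than U on W.

```prolog
:- op( 100, yfx, then ).     % stand-in for the ':' of the text (an assumption)

% A fragment of the cube world.
given( start, on( a, floor ) ).
given( start, on( b, floor ) ).
given( start, on( c, a ) ).
given( start, clear( b ) ).
given( start, clear( c ) ).

add( on( U, W ), move( U, _, W ) ).
add( clear( V ), move( _, V, _ ) ).

del( on( U, _ ), move( U, _, _ ) ).
del( clear( W ), move( _, _, W ) ).

% holds( +Fact, +State ): Fact is added by the last action, or it is
% not deleted by the last action and holds in the preceding state.
holds( X, _ then V ) :- add( X, V ).
holds( X, T then V ) :- !, \+ del( X, V ), holds( X, T ).
holds( X, T ) :- given( T, X ).
```

For the state start then move( c, a, floor ) then move( a, floor, b ) then move( c, floor, a ), the facts on( a, b ) and on( c, a ) both hold, while clear( b ) does not, because the second action deletes it and nothing restores it.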

The planning program (Listing 8.2, pp. 224-226) operates
independently of specific world descriptions. It assumes the presence of an
appropriate data base whose coherence is the responsibility of the user.

The program begins with a conjunction of facts (i.e. the description of
a desired final state) and the empty plan. In each step, the conjunction
shrinks and/or the plan grows; successive intermediate states
approximate the final state. Roughly speaking, the plan is constructed backwards:
we look for preconditions of actions that achieve the final state, then for
preconditions of actions that achieve those preconditions, etc. Unless a
fact holds in an intermediate state, the program chooses an action that
adds this fact, inserts the action into the current partial plan, removes the
fact from the current conjunction and adds to it the action’s
preconditions.

A partial plan usually contains variables. For example, to achieve a
on b, we use the action move(a, V, b), whose precondition includes the
fact a on V (for an unknown V). Such variables require some care: the fact
U on c may, in general, differ from a on V, even though the two terms are
unifiable. We can either use the built-in procedure == to compare facts,
or temporarily instantiate their variables (by the built-in procedure
numbervars) prior to the comparison.
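
The difference between unifiability and identity is easy to demonstrate. In the sketch below the three predicate names are hypothetical, facts are written on( U, W ) instead of U on W, and numbervars/3 is the three-parameter procedure found in present-day Prolog systems.

```prolog
% Two facts could denote the same relationship: they unify.
% The double negation discards the bindings made by the test.
unifiable_facts( F, G ) :- \+ \+ F = G.

% Two facts are the same as they stand: compare without binding anything.
same_fact( F, G ) :- F == G.

% The alternative: temporarily replace the variables by distinct ground
% terms, then compare by unification; the bindings are undone afterwards.
same_after_numbervars( F, G ) :-
    \+ \+ ( numbervars( F-G, 0, _ ), F = G ).
```

The facts on( U, c ) and on( a, V ) pass the first test but fail the other two; a fact compared with itself passes all three.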

In addition to the current conjunction and plan, the program
maintains a conjunction of desirable facts already planned for. No newly
inserted action can destroy any of these preserved facts.

The program is amazingly concise. In Warren’s original paper it was
accompanied by many pages of detailed considerations; hence the
absence of proper comments in the program text. Below we shall present, in
our own words, some indispensable technical explanations.

The main planning routine, plan, is called only if the final state
description is not inconsistent (lines 10-13), i.e. if it does not imply one of
the impossible combinations of facts. The procedure plan has three input
parameters: facts to be achieved, facts already achieved (initially true; see
line 13) and the current plan. It has one output parameter, the final plan.
The procedure solve is called for each fact of the initial goal list (see
lines 30-32). It has five parameters: a fact to be established, preserved facts,
the current plan, preserved facts after solve has succeeded and the new plan.

Every clause of solve accounts for a different status of the fact (lines
35-39). It may be always true; it may be true by virtue of general laws
external to “worlds” (e.g. equality or inequality of objects will be
checked by this clause); it may hold in the state described by the current
plan (to preserve it, we add it to the facts planned for; see lines 83-84);
otherwise (the last clause) we choose an action and call achieve.

The procedure achieve (lines 41-49) tries to apply a given action, i.e.
to insert it into the current plan (as the last action, or as the last but one,
etc.). The action U is applicable if it deletes none of the preserved facts,
if its precondition is consistent with these facts, and if a plan for
achieving this precondition can be constructed. Notice that possible
additions to P (preserved facts) made by the recursive call on plan are invisible
to achieve: they are only needed “locally” during the construction of the
intermediate plan T1. The additional call on preserves (line 45) is
necessary because of variables in the plan. For example, the action move(b, a,
W) need not delete the fact clear(c), so preserves lets it through; however,
plan may instantiate W as c, and this ought to cause a failure.

If, for any of these reasons, the action U cannot be added at the end of
the plan, achieve will try to undo the last action V and insert U earlier into
the plan. This is only possible if V does not delete the fact to be added by
U. The procedure retrace (lines 65-73) removes from the set of preserved
facts all facts that may be established by V but are different from V’s
preconditions. Specifically, it removes the facts added by V (lines 68-69)
and the facts that constitute the precondition of V (lines 70-71); the
latter facts will be re-inserted by append (see lines 66, 86-87)¹.

A few comments on the remaining procedures. A fact holds after
executing a given plan (lines 52-55), if it is given or added by one of the
actions, and preserved by all subsequent actions (if any). Two
conjunctions, C and P, are inconsistent (lines 76-78, 93-97) if C&P contains all
facts of an impossible combination S, except those which, like
notequal, are tested “metaphysically” (see line 95). For disjoint C, S this
cannot be the case; hence the call on intersect, which is relatively cheap.
Two object descriptions X and Y, with variables instantiated by
numbervars in mkground (line 101), may refer to the same object if X = Y or
X = 'V'(_) or Y = 'V'(_); see line 99. The procedure elem (lines 89-91)
extracts single facts from a nested conjunction; it can be used both to test
membership, and to generate facts.
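
Both uses of elem can be tried out in isolation. The three clauses below are those of Listing 8.2 (lines 89-91) written in standard syntax; the operator declaration repeats the one made for & in line 7 of that listing, and the conjunction of atoms used in the examples is chosen only for illustration.

```prolog
:- op( 200, xfy, & ).

elem( X, Y & _ ) :- elem( X, Y ).        % look inside the front conjunct
elem( X, _ & C ) :- !, elem( X, C ).     % then in the rest; a conjunction
                                         % itself is never returned as a fact
elem( X, X ).                            % a single fact
```

As a test, elem( b, a & b & c ) succeeds and elem( d, a & b & c ) fails. As a generator, findall( X, elem( X, a & b & c ), L ) collects the facts in their order of appearance, giving L = [a, b, c].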

Now that you have acquainted yourself with the planning program,
try it on a richer world. Here is the world of a robot that walks around
several rooms, moves some boxes, etc. (see Listing 8.1, lines 33-101).
Figure 8.2 depicts an initial state of this world. There are six points, five
rooms, four doors, three boxes, a light switch, and the robot. Nine types
of facts are considered: at(Object, Point), on(Object, Box),
nextto(Object1, Object2), pushable(Object), inroom(Object, Room),
locinroom(Point, Room), connects(Door, Room1, Room2), status(Lightswitch,
OnOff), and onfloor; the last characterizes the robot. Only the robot
performs actions; there are seven of them (see lines 64-77).

¹ The special treatment of V’s preconditions is necessary for actions which add facts
listed among their own preconditions. If retrace simply deleted V’s effects, such
preconditions could be lost from the list of facts which must be preserved by U, and those parts of the
plan which achieve “locally desirable” goals could inadvertently be destroyed in the
insertion process.

The procedure del merits a comment. It is supposed to delete more
than it should; we count on add to straighten the situation out. For
example, the action turnon(S) removes whatever status of S may be
recorded (line 54); “a moment later” it adds the appropriate fact (line 39).
The clauses in lines 49-50 say that a moved object X is no longer “next
to” anything. However, this does not apply to the robot manipulating a
box (lines 46-48): del fails, i.e. the fact is not deleted.

For sample results, see Listing 8.3, p. 227. 

Although WARPLAN is a feat of ingenuity, there is much more to
planning than it accounts for. For one thing, the plans it generates
need not be optimal, i.e. contain the least possible number of actions. For
example, action U in achieve (lines 47-49) is executed when it preserves
V’s precondition P; if we checked that U establishes P, we might delete
actions which had been planned to establish it. A much more profound
problem: in general, it is likely that conditional or iterative plans will be
required, rather than sequential ones (the robot explores the world).

Even with these (and other) limitations, and despite exponential time
complexity, WARPLAN is an excellent tool for experiments with
rigorous world descriptions. One example is the world of a robot that
assembles cars. Warren has also demonstrated how his program can be used to
compile arithmetic expressions into machine code (the code is treated as a
plan for placing some values in some registers).















LISTING 8.1  WARPLAN: Examples of worlds.

  1  %%%%%%  WARPLAN - cube worlds
  2
  3  :- op( 50, xfx, on ).
  4
  5  add( U on W, move( U, V, W ) ).
  6  add( clear( V ), move( U, V, W ) ).
  7
  8  del( U on Z, move( U, V, W ) ).
  9  del( clear( W ), move( U, V, W ) ).
 10
 11  can( move( U, V, floor ),
 12       U on V & notequal( V, floor ) & clear( U ) ).
 13  can( move( U, V, W ),
 14       clear( W ) & U on V & notequal( U, W ) & clear( U ) ).
 15
 16  imposs( X on Y & clear( Y ) ).
 17  imposs( X on Y & X on Z & notequal( Y, Z ) ).
 18  imposs( X on Z & Y on Z & notequal( Z, floor ) & notequal( X, Y ) ).
 19  imposs( X on X ).
 20
 21  % The three blocks problem.
 22  given( start, a on floor ).
 23  given( start, b on floor ).
 24  given( start, c on a ).
 25  given( start, clear( b ) ).
 26  given( start, clear( c ) ).
 27
 28  :- plans( c on a & a on b, start ).
 29  :- plans( a on b & b on c, start ).
 30  :- delop( on ), redefine.
 31  %......
 32
 33  %%%%%%  WARPLAN - the STRIPS problem
 34
 35  add( at( robot, P ), goto1( P, R ) ).
 36  add( nextto( robot, X ), goto2( X, R ) ).
 37  add( nextto( X, Y ), pushto( X, Y, R ) ).
 38  add( nextto( Y, X ), pushto( X, Y, R ) ).
 39  add( status( S, on ), turnon( S ) ).
 40  add( on( robot, B ), climbon( B ) ).
 41  add( onfloor, climboff( B ) ).
 42  add( inroom( robot, R2 ), gothrough( D, R1, R2 ) ).
 43
 44  del( at( X, Z ), U ) :- moved( X, U ).
 45  del( nextto( Z, robot ), U ) :- !, del( nextto( robot, Z ), U ).
 46  del( nextto( robot, X ), pushto( X, Y, R ) ) :- !, fail.
 47  del( nextto( robot, B ), climbon( B ) ) :- !, fail.
 48  del( nextto( robot, B ), climboff( B ) ) :- !, fail.
 49  del( nextto( X, Z ), U ) :- moved( X, U ).
 50  del( nextto( Z, X ), U ) :- moved( X, U ).
 51  del( on( X, Z ), U ) :- moved( X, U ).
 52  del( onfloor, climbon( B ) ).
 53  del( inroom( robot, Z ), gothrough( D, R1, R2 ) ).
 54  del( status( S, Z ), turnon( S ) ).
 55
 56  moved( robot, goto1( P, R ) ).
 57  moved( robot, goto2( X, R ) ).
 58  moved( robot, pushto( X, Y, R ) ).
 59  moved( X, pushto( X, Y, R ) ).
 60  moved( robot, climbon( B ) ).
 61  moved( robot, climboff( B ) ).
 62  moved( robot, gothrough( D, R1, R2 ) ).
 63
 64  can( goto1( P, R ),
 65       locinroom( P, R ) & inroom( robot, R ) & onfloor ).
 66  can( goto2( X, R ),
 67       inroom( X, R ) & inroom( robot, R ) & onfloor ).
 68  can( turnon( lightswitch(S) ),
 69       on( robot, box(1) ) & nextto( box(1), lightswitch(S) ) ).
 70  can( pushto( X, Y, R ),
 71       pushable( X ) & inroom( Y, R ) & inroom( X, R ) &
 72       nextto( robot, X ) & onfloor ).
 73  can( gothrough( D, R1, R2 ),
 74       connects( D, R1, R2 ) & inroom( robot, R1 ) &
 75       nextto( robot, D ) & onfloor ).
 76  can( climboff( box(B) ), on( robot, box(B) ) ).
 77  can( climbon( box(B) ), nextto( robot, box(B) ) & onfloor ).
 78
 79  always( inroom( D, R1 ) ) :- always( connects( D, R1, R2 ) ).
 80  always( connects( D, R2, R1 ) ) :- connects1( D, R1, R2 ).
 81  always( connects( D, R1, R2 ) ) :- connects1( D, R1, R2 ).
 82  always( pushable( box(N) ) ).
 83  always( locinroom( point(N), room(1) ) ) :- range( N, 1, 5 ).
 84  always( locinroom( point(6), room(4) ) ).
 85  always( inroom( lightswitch(1), room(1) ) ).
 86  always( at( lightswitch(1), point(4) ) ).
 87
 88  connects1( door(N), room(N), room(5) ) :- range( N, 1, 4 ).
 89
 90  range( M, M, _ ).
 91  range( M, L, N ) :-
 92      L < N, L1 is L + 1, range( M, L1, N ).
 93
 94  imposs( at( X, Y ) & at( X, Z ) & notequal( Y, Z ) ).
 95
 96  given( strips1, at( box(N), point(N) ) ) :- range( N, 1, 3 ).
 97  given( strips1, at( robot, point(5) ) ).
 98  given( strips1, inroom( box(N), room(1) ) ) :- range( N, 1, 3 ).
 99  given( strips1, onfloor ).
100  given( strips1, status( lightswitch(1), off ) ).
101  given( strips1, inroom( robot, room(1) ) ).
102
103  % A few tests.
104  :- plans( at( robot, point(5) ), strips1 ).
105  :- plans( at( robot, point(1) ) & at( robot, point(2) ), strips1 ).
106  :- plans( at( robot, point(4) ), strips1 ).
107  :- plans( status( lightswitch(1), on ), strips1 ).
108  :- plans( at( robot, point(6) ), strips1 ).

109 plans! ne xtto { bo x( 1), box(2)) & 

110 nexltof box (3). box!2)), stripsl ). 
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LISTING 8*2 WARPLAN—The general planner 


1   %%%%%% WARPLAN - A System for Generating Plans
2   %%%%%% ( published with the kind permission of the Author,
3   %%%%%%   David H. D. Warren )
4
5   %%% The general planner.
6   % ---
7   :- op( 200, xfy, & ), op( 100, yfx, : ).
8
9   % Generate and output a plan.
10  plans( C, _ ) :-
11      inconsistent( C, true ), !, write( 'Impossible.' ), nl.
12  plans( C, T ) :-
13      plan( C, true, T, T1 ), output( T1 ), !.
14  plans( _, _ ) :-
15      write( 'Cannot do this.' ), nl.
16
17  output( Xs:X ) :-
18      numbervars( Xs:X, 1, _ ), output1( Xs ), output2( X, '.' ).
19  output( _ ) :- write( 'Nothing need be done.' ), nl.
20
21  output1( Xs:X ) :- !, output1( Xs ), output2( X, ';' ).
22  output1( X ) :- output2( X, ':' ).
23
24  output2( Item, Punct ) :- write( Item ), write( Punct ), nl.
25
26  % Main planning routine.
27  % Definitions of 'always', 'imposs', 'given', 'can', 'add', 'del' -
28  % see specific world descriptions.
29
30  plan( X&C, P, T, T2 ) :-
31      !, solve( X, P, T, P1, T1 ), plan( C, P1, T1, T2 ).
32  plan( X, P, T, T1 ) :- solve( X, P, T, _, T1 ).
33
34  % Ways of solving a goal.
35  solve( X, P, T, P, T ) :- always( X ).
36  solve( X, P, T, P, T ) :- X.
37  solve( X, P, T, P1, T ) :- holds( X, T ), and( X, P, P1 ).
38  solve( X, P, T, X&P, T1 ) :-
39      add( X, U ), achieve( X, U, P, T, T1 ).
40
41  % Methods of achieving a goal -
42  % by extension:
43  achieve( _, U, P, T, T1:U ) :-
44      preserves( U, P ), can( U, C ), not inconsistent( C, P ),
45      plan( C, P, T, T1 ), preserves( U, P ).
46  % by insertion:
47  achieve( X, U, P, T:V, T1:V ) :-
48      preserved( X, V ), retrace( P, V, P1 ),
49      achieve( X, U, P1, T, T1 ), preserved( X, V ).
50
51  % Check if a fact holds in a given state.
52  holds( X, _:V ) :- add( X, V ).
53  holds( X, T:V ) :-
54      !, preserved( X, V ), holds( X, T ), preserved( X, V ).
55  holds( X, T ) :- given( T, X ).
56
57  % Prove that an action preserves a fact.
58  preserves( U, X&C ) :- preserved( X, U ), preserves( U, C ).
59  preserves( _, true ).
60
61  preserved( X, V ) :- check( pres( X, V ) ).
62  pres( X, V ) :- mkground( X&V ), not del( X, V ).
63
64  % Retracing a goal already achieved.
65  retrace( P, V, P2 ) :-
66      can( V, C ), retrace( P, V, C, P1 ), append( C, P1, P2 ).
67
68  retrace( X&P, V, C, P1 ) :-
69      add( Y, V ), X == Y, !, retrace( P, V, C, P1 ).
70  retrace( X&P, V, C, P1 ) :-
71      elem( Y, C ), X == Y, !, retrace( P, V, C, P1 ).
72  retrace( X&P, V, C, X&P1 ) :- retrace( P, V, C, P1 ).
73  retrace( true, _, _, true ).
74
75  % Inconsistency with a goal already achieved.
76  inconsistent( C, P ) :-
77      mkground( C&P ), imposs( S ),
78      check( intersect( C, S ) ), implied( S, C&P ), !.
79
80  %%% Utilities.
81  % ---
82
83  and( X, P, P ) :- elem( Y, P ), X == Y, !.
84  and( X, P, X&P ).
85
86  append( X&C, P, X&P1 ) :- !, append( C, P, P1 ).
87  append( X, P, X&P ).
88
89  elem( X, Y&_ ) :- elem( X, Y ).
90  elem( X, _&C ) :- !, elem( X, C ).
91  elem( X, X ).
92
93  implied( S1&S2, C ) :- !, implied( S1, C ), implied( S2, C ).
94  implied( X, C ) :- elem( X, C ).
95  implied( X, _ ) :- X.
96
97  intersect( S1, S2 ) :- elem( X, S1 ), elem( X, S2 ).
98
99  notequal( X, Y ) :- not X = Y, not X = 'V'(_), not Y = 'V'(_).
100
101 mkground( X ) :- numbervars( X, 0, _ ).

LISTING 8.3 WARPLAN - Sample results

Toy-Prolog listening:
?- % files without test cases but with end. at the end
   :- consult( planner ), consult( cubes ).
?- :- plans( c on a & a on b, start ).
start:
move( c, a, floor );
move( a, floor, b );
move( c, floor, a ).
?- :- plans( a on b & b on c, start ).
start:
move( c, a, floor );
move( b, floor, c );
move( a, floor, b ).
?- :- delop( 'on' ), reconsult( strips ).
?- :- plans( at( robot, point(5) ), strips1 ).
Nothing need be done.
?- :- plans( at( robot, point(1) ) & at( robot, point(2) ), strips1 ).
Impossible.
?- :- plans( at( robot, point(4) ), strips1 ).
strips1:
goto1( point(4), room(1) ).
?- :- plans( status( lightswitch(1), on ), strips1 ).
strips1:
goto2( box(1), room(1) );
pushto( box(1), lightswitch(1), room(1) );
climbon( box(1) );
turnon( lightswitch(1) ).
?- :- plans( at( robot, point(6) ), strips1 ).
strips1:
goto2( door(1), room(1) );
gothrough( door(1), room(1), room(5) );
goto2( door(4), room(5) );
gothrough( door(4), room(5), room(4) );
goto1( point(6), room(4) ).
?- :- plans( nextto( box(1), box(2) ) & nextto( box(3), box(2) ), strips1 ).
strips1:
goto2( box(1), room(1) );
pushto( box(1), box(2), room(1) );
goto2( box(3), room(1) );
pushto( box(3), box(2), room(1) ).
?- stop.
Toy-Prolog, end of session.

Bibliographic Notes

WARPLAN is described in Warren (1974). Our presentation has been
greatly influenced by this excellent paper. The program we publish here is
a slightly cleaned-up version of the text given in Coelho et al. (1980),
where all the mentioned examples of worlds can also be found. The robot's
world was introduced by Fikes and Nilsson (1971) as a test case for
their system STRIPS; Warren (1974) used it to compare the performance
of the two systems. An extension of WARPLAN, intended for generating
conditional plans, was described in Warren (1976).


8.2. PROLOG AND RELATIONAL DATA BASES 

In this section, we shall be primarily concerned with data bases in the 
limited sense: a data base is a purposefully structured collection of stored 
data, often pertaining to an organisation (e.g. a bank, factory, university, 
warehouse). In a relational data base all data are conceptually grouped 
into relations, which are usually depicted as rectangular tables as in Fig. 
8.3. A column in the table is called an attribute and referred to by a name, 
e.g. dno. All values of an attribute belong to a common domain, e.g. each
salary is an integer.

EMP

empno   name     dno   salary   mgrno
13      Miller   0     1500     19
21      Jones    1     1000     13
35      Brown                   21
38      White    1     800      35
43      Smith                   13
        Thomas                  21
89                              35
42      Miller                  35

DEPT

dno   name               mgrno
1     Public Relations   21
2     Security           43

FIG. 8.3 Contents of a relational data base.

A relation is a set of tuples (table rows) which consist of attribute
values, e.g.

< 38, White, 1, 800, 35 >.

Tuples belong to the set described by a relation schema which specifies 
names, domains and order of attributes, e.g. 

EMP < integer empno, string name, integer dno, 

integer salary, integer mgrno > 

DEPT < integer dno, string name, integer mgrno > 

Two tuples may share the value of an attribute, and thus implicitly fall 
into one group; for example, Brown and Thomas are both subordinates of 
a manager whose number is 21. 

A relation can be changed by inserting, deleting or updating some of 
its tuples. These operations are referred to as data manipulation. 

A query to the data base is answered by enumerating tuples of the
resulting relation (or by computing an aggregate function, such as
“average” or “total”, over these tuples). Most queries are expressible
in terms of the following primitive operations on relations.

- Selection chooses tuples for which a given condition holds; for
  example, we can select from EMP those employees of department 1 who
  earn over 900 (there are three such tuples).

- Projection neglects some attributes and (possibly) reorders the
  remaining ones; for example, we can project EMP over name, empno, and
  salary, to get

  < Miller, 13, 1500 >

  and seven other triples.

- Join of two relations A, B forms a new relation. It consists of those
  concatenations of tuples from A with tuples from B, for which a given
  condition holds. For example, the join of EMP and DEPT, such that
  department numbers coincide, consists of the tuple

  < 21, Jones, 1, 1000, 13, 1, Public Relations, 21 >

  and six other 8-tuples.

- Unconditional join is a product; for EMP and DEPT the product
  consists of sixteen 8-tuples.

- Finally, set operations, namely union, intersection and difference, can
  be applied to two relations whose corresponding attributes belong to
  the same domain, i.e. whose schemata differ only in names.

Much of this conceptual framework is naturally translated into Prolog.
A relation is modelled as a procedure made of unit clauses which
correspond to tuples, for example:

'EMP'( 13, 'Miller', 0, 1500, 19 ).
'EMP'( 21, 'Jones', 1, 1000, 13 ).
etc.

(we use quotes to prevent capitalized names from being treated as
variables). To change a relation, we use the built-in procedures assert
and retract.

Primitive operations on relations are expressed in terms of procedure
calls. For example, the procedure

s( Empno, Name, Dno, Salary, Mgrno ) :-
    'EMP'( Empno, Name, Dno, Salary, Mgrno ),
    Dno = 1, Salary > 900.

can be used to generate all tuples for employees of department 1 who earn
over 900 (i.e. to implement selection):

:- s( E, N, D, S, M ), write( ( E, N, D, S, M ) ), nl, fail.

Better still, we can substitute 1 for Dno and remove the test:

s( E, N, 1, S, M ) :- 'EMP'( E, N, 1, S, M ), S > 900.

The procedure p can be used to implement projection:

p( Name, Empno, Salary ) :-
    'EMP'( Empno, Name, _, Salary, _ ).

The composition of these two operations can be expressed in Prolog quite
succinctly:

s_then_p( Name, Empno, Salary ) :-
    'EMP'( Empno, Name, 1, Salary, _ ), Salary > 900.

Or we can put this directly into a query:

:- 'EMP'( E, N, 1, S, _ ), S > 900,
   write( ( N, E, S ) ), nl, fail.

Finally, here is the join of EMP and DEPT over coinciding department
numbers:

j( Empno, NameE, DnoE, Salary, MgrnoE, DnoE, NameD, MgrnoD ) :-
    'EMP'( Empno, NameE, DnoE, Salary, MgrnoE ),
    'DEPT'( DnoE, NameD, MgrnoD ).
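
The three operations can be tried out on a small data base. The sketch below is written in present-day standard Prolog and uses the tuples of the Toy-Sequel session shown later in this section; the lowercase names emp/3, dept/2, sel, proj and join are our own illustrative choices, not part of the text above.

```prolog
% emp( Name, Salary, Dno ) and dept( Dno, Manager ): illustrative
% relations, with the tuples of the sample Toy-Sequel session.
emp( brown,  1000, 1 ).
emp( white,   800, 1 ).
emp( miller,  850, 1 ).
emp( barry,   900, 2 ).
emp( thomas,  850, 1 ).
emp( morgan, 1050, 1 ).

dept( 1, jones ).
dept( 2, smith ).

% Selection: employees of department 1 who earn over 900.
sel( Name, Salary, 1 ) :- emp( Name, Salary, 1 ), Salary > 900.

% Projection over salary and name (dno neglected, order changed).
proj( Salary, Name ) :- emp( Name, Salary, _ ).

% Join over coinciding department numbers: the shared variable Dno
% expresses the join condition.
join( Name, Salary, Dno, Dno, Manager ) :-
    emp( Name, Salary, Dno ), dept( Dno, Manager ).
```

A call such as findall( N, sel( N, _, _ ), Ns ) collects the selected names in clause order; the failure-driven loops shown above would display them one by one instead.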
All these operations are neatly explained in terms of static
interpretation of procedures (try for yourself!). Set operations are even
more straightforward. Let a( X1, ..., Xn ) and b( X1, ..., Xn ) denote
generators of tuples, such as 'DEPT'( D, N, M ) or p( N, E, S ). We have

aUNIONb( X1, ..., Xn ) :-
    a( X1, ..., Xn ) ; b( X1, ..., Xn ).

aINTERSECTIONb( X1, ..., Xn ) :-
    a( X1, ..., Xn ), b( X1, ..., Xn ).

aDIFFERENCEb( X1, ..., Xn ) :-
    a( X1, ..., Xn ), not b( X1, ..., Xn ).
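
For a concrete instance of the three schemes, here is a minimal sketch with two generators of 1-tuples. The generators a/1 and b/1 and the helper union_set are assumptions made for illustration; the standard negation \+ plays the part of not.

```prolog
% Two illustrative generators of 1-tuples.
a( 1 ).  a( 2 ).  a( 3 ).
b( 2 ).  b( 3 ).  b( 4 ).

a_union_b( X )        :- a( X ) ; b( X ).
a_intersection_b( X ) :- a( X ), b( X ).
a_difference_b( X )   :- a( X ), \+ b( X ).

% The union generator yields tuples common to a and b twice;
% sort/2 removes the duplicates when a true set is wanted.
union_set( Set ) :-
    findall( X, a_union_b( X ), Xs ), sort( Xs, Set ).
```

Note that the difference only works as intended when the call to a has fully instantiated the tuple before the negated call to b is made.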

Queries which involve only primitive operations can be answered
without actually creating the resulting relation. Its tuples can be generated
by a failure-driven loop and displayed immediately. To compute an
aggregate function, however, we need the whole attribute (column) at once.
We can construct it by means of the procedure bagof (see Section 4.2.4); for
example:

bagof( Salary, 'EMP'( _, _, _, Salary, _ ), Salaries ),
max_of( Salaries, MaxSal ), write( MaxSal ), nl.
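
The procedure max_of is not defined in the text; the following is one possible sketch, in standard Prolog, on the illustrative relation emp( Name, Salary, Dno ). One caveat is worth stating: in standard Prolog, bagof treats variables of the goal that do not occur in the template (anonymous ones included) as free and produces one bag per combination of their values. We therefore mark them as existentially quantified with ^ (findall would serve equally well).

```prolog
% Illustrative relation: emp( Name, Salary, Dno ).
emp( brown,  1000, 1 ).
emp( white,   800, 1 ).
emp( barry,   900, 2 ).
emp( morgan, 1050, 1 ).

% max_of( List, Max ): Max is the largest number in a non-empty List.
max_of( [X | Xs], Max ) :- max_of( Xs, X, Max ).

% max_of( List, SoFar, Max ): auxiliary with the best value so far.
max_of( [], Max, Max ).
max_of( [X | Xs], SoFar, Max ) :-
    ( X > SoFar -> Next = X ; Next = SoFar ),
    max_of( Xs, Next, Max ).

% The whole salary column at once. N^D^ keeps bagof from grouping
% the salaries by name and department.
max_salary( Max ) :-
    bagof( S, N^D^emp( N, S, D ), Salaries ),
    max_of( Salaries, Max ).
```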

Sometimes we can also use bagof for efficiency. For example, we look
for employees of department 1 who earn no more than 1000 and who have
been FTU members since 1980 or earlier:

:- 'EMP'( E, N, 1, S, _ ), S =< 1000,
   'FTU'( E, DateJoined ), DateJoined =< 1980,
   write( ( E, N ) ), nl, fail.

With those implementations of Prolog which do not support clause
indexing, the entire relation FTU would be scanned many times. Instead, we
can precompute the necessary set:

:- bagof( Empno, ( 'FTU'( Empno, DateJoined ),
                   DateJoined =< 1980 ), FTUMembers ),
   'EMP'( E, N, 1, S, _ ), S =< 1000,
   member( E, FTUMembers ), write( ( E, N ) ), nl, fail.
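
A self-contained version of the precomputed query might look as follows. All tuples here are made up for illustration (the text does not show the contents of FTU), and emp/4, ftu/2, old_members and cheap_old_members are hypothetical names. We use findall, which returns an empty list instead of failing when nobody qualifies.

```prolog
% Made-up data: emp( Empno, Name, Dno, Salary ), ftu( Empno, YearJoined ).
emp( 21, jones, 1, 1000 ).
emp( 35, brown, 1,  950 ).
emp( 38, white, 1,  800 ).
emp( 43, smith, 2,  900 ).

ftu( 21, 1975 ).
ftu( 38, 1983 ).
ftu( 43, 1979 ).

% One pass through ftu collects the members who joined by 1980.
old_members( Members ) :-
    findall( E, ( ftu( E, Year ), Year =< 1980 ), Members ).

% The list is then consulted once per selected emp tuple.
cheap_old_members( Answers ) :-
    old_members( Members ),
    findall( E-N,
             ( emp( E, N, 1, S ), S =< 1000, memberchk( E, Members ) ),
             Answers ).
```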

There are queries which cannot be expressed as a composition of
selections, projections, joins and aggregate functions. A classical
example: find every employee who earns more than at least one of her/his
superiors. The relation “is a superior of” is inherently transitive, but
with the primitive operations we can only express the relations “is an
immediate manager of”, “is an immediate manager of an immediate manager
of” etc. In Prolog, however, the problem is easily solved. For example,
we can define a procedure to generate managers’ salaries:

mgr_sal( Mgrno, Salary ) :- 'EMP'( Mgrno, _, _, Salary, _ ).
mgr_sal( Mgrno, Salary ) :-
    'EMP'( Mgrno, _, _, _, MgrMgrno ),
    mgr_sal( MgrMgrno, Salary ).

(try to rewrite it so as to avoid the repeated pass through EMP). From the
standpoint of the caller, this generator is indistinguishable from those made of
unit clauses. The query can be written as follows:

:- 'EMP'( Empno, _, _, Salary, Mgrno ),
   once( ( mgr_sal( Mgrno, MgrSal ), MgrSal < Salary ) ),
   write( Empno ), nl, fail.
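
Here is one possible answer to the exercise, offered as a sketch only: a single access to the relation supplies both the manager's salary and the number of the manager's manager. The tuples are made up for illustration, the lowercase emp/5 stands in for 'EMP', and earns_more_than_a_superior is a hypothetical name. We assume that the management chain contains no cycles; otherwise the recursion would not terminate.

```prolog
% Made-up data: emp( Empno, Name, Dno, Salary, Mgrno ).
% Manager number 0 denotes nobody, so the chain ends at employee 1.
emp( 1, adams, 0, 2000, 0 ).
emp( 2, baker, 1,  900, 1 ).
emp( 3, clark, 1, 1200, 2 ).
emp( 4, davis, 1,  950, 2 ).
emp( 5, evans, 1, 2500, 3 ).
emp( 6, ford,  1,  800, 3 ).

% mgr_sal( Mgrno, Salary ): Salary is earned by manager Mgrno
% or by one of his/her superiors. Each level costs one call of emp.
mgr_sal( Mgrno, Salary ) :-
    emp( Mgrno, _, _, MgrSalary, MgrMgrno ),
    (   Salary = MgrSalary
    ;   mgr_sal( MgrMgrno, Salary )
    ).

% once/1 stops at the first superior who earns less.
earns_more_than_a_superior( Empno ) :-
    emp( Empno, _, _, Salary, Mgrno ),
    once( ( mgr_sal( Mgrno, MgrSal ), MgrSal < Salary ) ).
```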

A relation which is computed rather than stored (e.g. s, p, s_then_p, j
above) is called a view in the relational data base terminology. A view
results from primitive operations on stored relations (possibly
indirectly, via other views) and it changes as those relations change.
Conversely, a change in a view might influence those relations, but this
poses quite nontrivial problems. The relation mgr_sal, however, can only
be obtained by embedding primitive operations in a host programming
language, furnished with recursion or iteration. An important advantage of
Prolog is its ability to express tuples, views, and special programs in the
same language. In particular, it offers a possibility of enforcing integrity
constraints: application-specific conditions of the coherence of data.
Constraints should be tested prior to any change to a relation. For
example, we can use this procedure to insert only correct tuples:

insert( Tuple ) :-
    correct_insert( Tuple ), !, assertz( Tuple ).
insert( Tuple ) :- signal_violation( Tuple ).

correct_insert( 'EMP'( E, _, D, _, M ) ) :-
    !, E =\= M, 'DEPT'( D, _, _ ).    % there is such a dept
correct_insert( _ ).                  % others are OK
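
The procedure can be exercised as follows. This is a sketch in standard Prolog under a few assumptions of ours: the relations are written in lowercase and declared dynamic, and signal_violation, which the text leaves undefined, simply records the rejected tuple in a hypothetical relation violation/1.

```prolog
:- dynamic emp/5, dept/3, violation/1.

dept( 1, 'Public Relations', 21 ).
dept( 2, 'Security', 43 ).

% Insert a tuple only if it passes the integrity test.
insert( Tuple ) :-
    correct_insert( Tuple ), !, assertz( Tuple ).
insert( Tuple ) :- signal_violation( Tuple ).

% An employee is not his/her own manager and works in a known dept.
correct_insert( emp( E, _, D, _, M ) ) :-
    !, E =\= M, dept( D, _, _ ).
correct_insert( _ ).                    % others are OK

% Hypothetical reaction to a violation: remember the rejected tuple.
signal_violation( Tuple ) :- assertz( violation( Tuple ) ).
```

After insert( emp( 51, grey, 9, 700, 21 ) ) the relation emp is unchanged, because there is no department 9, and the tuple can be found in violation.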

On the whole, Prolog is a powerful tool for data base applications.
Admittedly, there is more to data base systems than our presentation
suggests. For one thing, the size of a real data base may far exceed the
capacity of any existing Prolog implementation. The model described in
Chapter 6 ought to be augmented: clauses would be stored on disk and
handled by standard or specialized access methods. Second, every
practical data base implementation should address problems such as
concurrent execution of users’ commands, recovery after hardware
failures, etc. There are no ready solutions in Prolog but presumably they
can be programmed into it.

Surprisingly, Prolog is, in a sense, too strong, too unrestricted. For
example, to ensure the conformance of a tuple with the relation schema,
some form of type checking is required, presumably as explicit tests.
Unrestrained use of assert/retract may also ruin the integrity of a data
base in other ways. Consequently, Prolog should rather be considered a
tool for implementing more restricted user interfaces: queries and
commands in a user language are analysed (types checked, integrity ensured,
etc.), and only then translated into Prolog.

A particularly attractive option would be to query the data base in a
natural language. Several encouraging small experiments have been
carried out. At the moment, though, this is much more a research problem in
its own right than a generally available programming technique.

Relational data languages, notably Sequel and Query-by-Example,
provide syntactic sugar for relation schema definitions, data manipulation
and queries (involving Boolean expressions and compositions of primitive
relational operations). In contrast with natural language interfaces, a
Prolog implementation of a relational data language is a programming task
of moderate complexity.

We shall now present Toy-Sequel, a relational data language
patterned after Sequel and implemented in Prolog (see Listing 8.4). With the
exception of aggregate functions, expressions in tuple specifications and
some exotic features, it supports all that is essential in Sequel. Extensions
are relatively easy to introduce (we left them out to make the program
shorter). To give the flavour of the language, here is an annotated
conversation with our program, initiated by the call

:- toysequel.

To begin with, we specify a few relation schemas: 

create EMP < string name, integer salary, integer dno >. 

create DEPT < integer dno, string manager >. 

create BoardMembers < string name, string position, 

integer seniority >. 

Now we insert some tuples: 

into EMP insert < "Brown", 1000, 1 >, < "White", 800, 1 >,
                < "Miller", 850, 1 >, < "Barry", 900, 2 >,
                < "Thomas", 850, 1 >, < "Morgan", 1050, 1 >.

into DEPT insert < 1, "Jones" >, < 2, "Smith" >.

We can ask what relations the data base contains; Toy-Sequel
displays their names (its responses are italicized):

relations. 

BoardMembers 

DEPT 

EMP 

What is the schema of EMP? 

relation EMP. 
string name 
integer salary 
integer dno 

A select expression determines a set of tuples. They may be
displayed. For example, who in departments other than 2 earns at least 1000?

select from EMP tuples < name, salary >
where dno <> 2 and salary >= 1000.

Brown 1000
Morgan 1050

Or they may be inserted elsewhere:

into EMP insert
select from DEPT tuples < manager, 1000, dno >.

In the absence of “where” the condition is taken as true.

Both managers have the salary 1000. We can give Smith a raise:

update EMP so that salary = 1200 where name = "Smith".

Fire Barry:

from EMP delete tuples where name = "Barry".

If several relations are involved, e.g. in a join, attribute names may be
ambiguous. To disambiguate, qualify them with relation names. For
example:

select from EMP, DEPT tuples < name, EMP_dno, manager >
where EMP_dno = DEPT_dno.

(Actually, EMP_dno may be replaced with dno: an unqualified attribute
name is qualified with the leftmost appropriate relation name.)

A relation may be accessed in several places at once. For example, to
compare salaries of different employees we need the product of EMP by
EMP. We must give one of the occurrences an alias name and so allow
unambiguous references to attribute values. The following query joins the
relations EMP, DEPT and EMP alias Mgr, to find employees who earn
more than their (immediate) manager:

select from EMP, DEPT, Mgr = EMP tuples < EMP_name >
where EMP_dno = DEPT_dno
and DEPT_manager = Mgr_name
and Mgr_salary < EMP_salary.

Morgan

Again, the qualification with EMP is superfluous, as well as the
qualification of manager.

A similar condition can be used to give a raise of half the difference in
salaries to those who earn over 100 less than their manager:

update EMP using DEPT, Mgr = EMP
so that salary = EMP_salary + ( Mgr_salary - EMP_salary ) / 2
where EMP_dno = DEPT_dno
and manager = Mgr_name
and Mgr_salary > EMP_salary + 100.

Two miscellaneous queries. Find employees whose names do not
begin with M:

select from EMP tuples < name >
where name < "M" or name >= "N".

(five of them). And find EMP tuples with a nonexistent department
number (this is a kind of manual integrity checking):

select from EMP tuples < name, salary, dno > where
not < dno > in select from DEPT tuples < dno >.

in denotes set membership. The name dno in the nested select expression
pertains to DEPT.

Time to finish. The relation BoardMembers will not be necessary,
after all:

cancel BoardMembers.

Store the data base in a file:

dump to AAA.

(next time we shall begin with

load from AAA.

and resume at this point). Finally, return control to Prolog:

stop.

We shall not go into details of the Toy-Sequel interpreter. The
rationale for its design was given above; the program is (almost)
self-documenting. The following remarks account for a few central
technical decisions.

The main procedure, toysequel (lines 4-7 in the listing), repeatedly
reads and executes commands. The procedure getcommand (lines 9-11)
returns a Prolog goal, which is the translation of a command, and a flag.
The flag remains uninstantiated if the command is correct, otherwise it is
instantiated as error. The procedure docommand (lines 13-14) executes a
correct command's translation, and does nothing in the case of errors.

A command is processed in three phases. The text, terminated with a
dot, is read in (lines 32-43) and then passed through a scanner,
implemented as a metamorphosis grammar (lines 45-84). It classifies tokens as
names, strings, integers and single non-alphanumeric characters. A list of
tokens goes to the command compiler, a metamorphosis grammar which
is the core of the interpreter. The grammar consists of 11 parts, one for
each Toy-Sequel command (see lines 113-123).
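
To give an idea of what such a scanner looks like in a present-day Prolog, here is a minimal sketch written with standard grammar rules (DCG). It is not the scanner of Listing 8.4: signs of integers and doubled quotes inside strings are left out, and the names scan/2, strchars, is_letter and is_digit_char are our own. The token forms n(Name), s(Chars), i(Integer) and single characters follow the listing.

```prolog
% scan( +Atom, -Tokens ): split the text of Atom into tokens.
scan( Atom, Tokens ) :-
    atom_chars( Atom, Chars ), phrase( tokens( Tokens ), Chars ).

tokens( [T | Ts] ) --> spaces, token( T ), !, tokens( Ts ).
tokens( [] )       --> spaces.

token( n( Name ) ) -->
    letter( L ), namechars( Cs ), { atom_chars( Name, [L | Cs] ) }.
token( s( Chars ) ) --> ['"'], strchars( Chars ).
token( i( Int ) ) -->
    digit( D ), digits( Ds ), { number_chars( Int, [D | Ds] ) }.
token( Ch ) --> [Ch].                 % any other single character

namechars( [C | Cs] ) --> ( letter( C ) ; digit( C ) ), !, namechars( Cs ).
namechars( [] ) --> [].

strchars( [] )       --> ['"'], !.    % closing quote ends the string
strchars( [C | Cs] ) --> [C], strchars( Cs ).

digits( [D | Ds] ) --> digit( D ), !, digits( Ds ).
digits( [] ) --> [].

spaces --> [' '], !, spaces.          % optional spaces
spaces --> [].

letter( Ch ) --> [Ch], { is_letter( Ch ) }.
digit( Ch )  --> [Ch], { is_digit_char( Ch ) }.

is_letter( Ch ) :-
    char_code( Ch, C ),
    ( C >= 0'a, C =< 0'z ; C >= 0'A, C =< 0'Z ), !.
is_digit_char( Ch ) :-
    char_code( Ch, C ), C >= 0'0, C =< 0'9.
```

For example, scan( 'dno <> 2', Ts ) yields the four tokens n(dno), <, > and i(2); the command compiler then sees <> as two adjacent single-character tokens.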

All commands, except load and stop, manipulate the relation
catalogue. The catalogue is implemented as a three-parameter procedure
'r e l', with a unit clause for each relation schema. A schema stores the
name of a relation, a generator of this relation’s tuples and a “frame” of
symbol table entries linking attribute names and types to variables in the
generator (see lines 131-141). For example, the command

create EMP < string name, integer salary, integer dno >.

adds the clause

'r e l'( 'EMP', ' EMP'( Name, Salary, Dno ),
         [ attr( name, string, Name ), attr( salary, integer, Salary ),
           attr( dno, integer, Dno ) ] ).

Blanks are added to relation names in generators to make conflicts with
other procedures less plausible.
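
The construction of a catalogue entry can be sketched as follows. This is not the code of Listing 8.4: the procedure names new_relation and make_attrs, the catalogue name rel/3 and the representation of columns as Type-Name pairs are assumptions made for this illustration. The point to notice is that the generator and the frame share their variables.

```prolog
:- dynamic rel/3.

% new_relation( +Name, +Columns ): Columns is a list of Type-AttrName
% pairs. Adds a schema clause to the catalogue rel/3.
new_relation( Name, Columns ) :-
    atom_concat( ' ', Name, GenName ),     % leading blank, as in the text
    make_attrs( Columns, Vars, Attrs ),
    Generator =.. [GenName | Vars],        % e.g. ' EMP'( V1, V2, V3 )
    assertz( rel( Name, Generator, Attrs ) ).

% One fresh variable per column, recorded in the frame entry as well.
make_attrs( [], [], [] ).
make_attrs( [Type-Nm | Cols], [V | Vs], [attr( Nm, Type, V ) | As] ) :-
    make_attrs( Cols, Vs, As ).
```

After new_relation( 'EMP', [string-name, integer-salary, integer-dno] ), every call of rel( 'EMP', Generator, Frame ) returns a fresh copy of the schema in which binding an attribute's variable in Frame also binds the corresponding argument of Generator.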

The command processors for select, insert, delete and update
maintain a symbol table, a stack of frames taken from the catalogue. For
example, attribute names in the command

select from EMP, Mgr = EMP, DEPT
tuples < name, Mgr_dno, manager >
where dno = DEPT_dno and manager = Mgr_name
and salary > Mgr_salary.

will be looked for in the following symbol table (see lines 208-223, and
96-102):

[ 'EMP' : [ attr( name, string, NameEMP ),
            attr( salary, integer, SalaryEMP ),
            attr( dno, integer, DnoEMP ) ],
  'Mgr' : [ attr( name, string, NameMgr ),
            attr( salary, integer, SalaryMgr ),
            attr( dno, integer, DnoMgr ) ],
  'DEPT' : [ attr( dno, integer, DnoDEPT ),
             attr( manager, string, ManagerDEPT ) ] ]

(Nested select expressions would push their own frames onto this stack;
see line 277.)

The product of these three relations will be generated by the following
calls retrieved from the catalogue:

' EMP'( NameEMP, SalaryEMP, DnoEMP ),
' EMP'( NameMgr, SalaryMgr, DnoMgr ),
' DEPT'( DnoDEPT, ManagerDEPT )

The condition will be translated into a Prolog goal (see lines 236-237, 239-
362). The goal will be executed immediately after the generators (lines
166-168). Attribute names in the condition will be translated into
variables from the symbol table. Thus,

salary > Mgr_salary

will become

SalaryEMP > SalaryMgr

The “equalities”

DnoEMP = DnoDEPT, ManagerDEPT = NameMgr

will be processed at compile time, by binding variables together (line 318),
so that actually only six different variables will occur in the generators.

The tuple pattern (lines 225-234) will also contain variables from the
symbol table:

[ NameEMP, DnoMgr, ManagerDEPT ]

One such tuple will be displayed in every step of the failure-driven loop
(lines 166-168).
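
The whole translation of this command can be imitated by hand. The sketch below is not taken from Listing 8.4: it represents a frame as frame( Alias, Attrs ) instead of using the operator ":", collects the answers with findall instead of displaying them, and its procedure names are our own. The relation contents are those of the sample session at the moment the similar query was asked (Barry fired, both managers inserted, no raises yet).

```prolog
emp( brown, 1000, 1 ).    emp( white, 800, 1 ).
emp( miller, 850, 1 ).    emp( thomas, 850, 1 ).
emp( morgan, 1050, 1 ).   emp( jones, 1000, 1 ).
emp( smith, 1200, 2 ).
dept( 1, jones ).         dept( 2, smith ).

% find_attr( Qual-Name, Var, Type, SymTab ): an unbound qualifier
% matches the leftmost frame that contains the attribute.
find_attr( Q-Nm, Var, Type, [frame( Q, Attrs ) | _] ) :-
    member( attr( Nm, Type, V ), Attrs ), !, Var = V.
find_attr( QNm, Var, Type, [_ | ST] ) :-
    find_attr( QNm, Var, Type, ST ).

more_than_manager( Rows ) :-
    ST = [ frame( 'EMP',  [ attr( name, string, NameE ),
                            attr( salary, integer, SalE ),
                            attr( dno, integer, DnoE ) ] ),
           frame( 'Mgr',  [ attr( name, string, NameM ),
                            attr( salary, integer, SalM ),
                            attr( dno, integer, DnoM ) ] ),
           frame( 'DEPT', [ attr( dno, integer, DnoD ),
                            attr( manager, string, MgrD ) ] ) ],
    Generators = ( emp( NameE, SalE, DnoE ),
                   emp( NameM, SalM, DnoM ),
                   dept( DnoD, MgrD ) ),
    % "Compile time": the equalities bind table variables together.
    find_attr( _-dno, A1, _, ST ),
    find_attr( 'DEPT'-dno, A2, _, ST ),  A1 = A2,
    find_attr( _-manager, B1, _, ST ),
    find_attr( 'Mgr'-name, B2, _, ST ),  B1 = B2,
    % The inequality becomes a run-time test on table variables.
    find_attr( _-salary, C1, _, ST ),
    find_attr( 'Mgr'-salary, C2, _, ST ),
    Condition = ( C1 > C2 ),
    % Tuple pattern < name, Mgr_dno, manager >.
    find_attr( _-name, T1, _, ST ),
    find_attr( 'Mgr'-dno, T2, _, ST ),
    Tuple = [T1, T2, B1],
    % "Run time": the generators first, then the condition.
    findall( Tuple, ( Generators, Condition ), Rows ).
```

When the generators are finally called, the third one has become dept( DnoE, NameM ): the join conditions cost nothing at run time beyond ordinary unification.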

A construction that would certainly benefit from a more detailed
explanation is update. We shall comment on the example shown in the
listing (lines 396-411):

update EMP using DEPT, Mgr = EMP
so that salary = salary + ( Mgr_salary - salary ) / 5
where salary < Mgr_salary - 1000 and Mgr_name = manager
and DEPT_dno = dno and not < Mgr_name > in
select from BoardMembers tuples < name >.

First, two copies of the stack frame are created, and two call patterns
(OldTup and NewTup):

' EMP'( Name, Salary, Dno )
' EMP'( NewName, NewSalary, NewDno )

Now, makemodlist creates a raw modification list:

[ modif( attr( name, string, Name ), NewName, ModName ),
  modif( attr( salary, integer, Salary ), NewSalary, ModSalary ),
  modif( attr( dno, integer, Dno ), NewDno, ModDno ) ]

A symbol table is constructed, first a frame for EMP (note that old
attribute values will be retrieved and used), next for DEPT and EMP (with the
alias name Mgr). During the construction of Modifications (lines 420-421,
410), the raw list is changed by findmname (lines 454-465, 434):
ModSalary is instantiated as true (line 462) to note that the salary will be
modified. Finally, closemodlist (lines 447-452, 422) binds together variables
that stand for unmodified attributes, i.e. Name with NewName and Dno
with NewDno. Also, equalities in lines 398-399 cause two other pairs of
variables to be bound together (see line 318).

A comment on error treatment. Incorrect data do not terminate
processing. Instead, the procedure ancestor instantiates the variable Errflag
(lines 5, 484, 488); this prevents the command from being executed (lines
13-14), but the analysis continues. The grammar rules synerre (lines 486-
498) display the troublesome token and the others to its right, and then
succeed leaving the token list intact.

In actual use, Toy-Sequel would probably be found too simple.
However, many extensions are quite straightforward. As an exercise, try to
augment Toy-Sequel with defined views, e.g.

view EMP1 < name, salary > as
select from EMP tuples < name, salary > where dno = 1.

Another extension: “wild card” tuple specifications, e.g.

select from EMP tuples *.

(i.e. tuples < name, salary, dno > )

select from EMP, DEPT tuples < EMP_*, manager >
where EMP_dno = DEPT_dno.

(i.e. tuples < name, salary, EMP_dno, manager > ).

And aggregate functions, e.g.

select from EMP average of < salary >.

An example of less straightforward modifications is query
optimisation. Consider the command:

select from EMP, DEPT tuples < name, salary >
where dno = DEPT_dno and manager = "Jones".

The answer will be generated by the calls

' EMP'( Name, Salary, Dno ), ' DEPT'( Dno, "Jones" )

which access every EMP tuple, even though only one department is
involved. The same set of tuples would be generated by the calls

' DEPT'( Dno, "Jones" ), ' EMP'( Name, Salary, Dno )

but now other departments’ tuples would never be retrieved. This
optimisation could speed things up considerably for Prolog implementations
with clause indexing.
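
The effect of the reordering can be observed on the sample data. In the sketch below (our own illustration, with lowercase relation names and hypothetical procedure names) both orderings give the same answers, but they differ in the number of emp tuples that are actually retrieved, i.e. successfully matched by the call of emp.

```prolog
emp( brown,  1000, 1 ).
emp( white,   800, 1 ).
emp( miller,  850, 1 ).
emp( barry,   900, 2 ).
emp( thomas,  850, 1 ).
emp( morgan, 1050, 1 ).

dept( 1, jones ).
dept( 2, smith ).

% The same query with the generators in both orders.
answers_emp_first( Rows ) :-
    findall( N-S, ( emp( N, S, Dno ), dept( Dno, jones ) ), Rows ).
answers_dept_first( Rows ) :-
    findall( N-S, ( dept( Dno, jones ), emp( N, S, Dno ) ), Rows ).

% How many emp tuples does each ordering retrieve?
retrieved_emp_first( Count ) :-            % Dno still unbound
    findall( x, emp( _, _, _ ), Xs ), length( Xs, Count ).
retrieved_dept_first( Count ) :-           % Dno bound by dept
    findall( x, ( dept( Dno, jones ), emp( _, _, Dno ) ), Xs ),
    length( Xs, Count ).
```

With an implementation that indexes clauses on the department number, the tuples that are not retrieved need not even be inspected; without indexing they are still scanned, but rejected by unification at once.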

Bibliographic Notes

The bibliography on data bases is enormous. We shall only name a
few items relevant to our presentation which (of necessity) has only
touched on basic facts. Two widely accepted introductory textbooks on
data bases in general are Ullman (1982) and Date (1982). The relational
model of data was introduced by Codd (1970) and further elaborated by
many, including Codd himself (1979). The most popular relational data
languages are probably Quel, used in the data base system INGRES
(Stonebraker et al. 1976), Sequel, created for the system R (Astrahan
1976; Chamberlin et al. 1976) and Query-by-Example (Zloof 1977).

In the proceedings of conferences on logic in data bases (Gallaire and
Minker 1978, Gallaire et al. 1981) there are, in particular, papers on the
role of logic programming in data base theory and applications. The
advantages of Prolog (and logic programming at large) for data bases have
been advocated by quite a few authors, e.g. Kowalski (1978), Gallaire
(1983) and Lloyd (1982). A practical demonstration of Prolog’s power is
Chat80 (Warren and Pereira 1982; Warren 1981), a system with a natural
language interface. Queries in English are translated into Prolog calls;
they are similar to those produced by Toy-Sequel, but Chat80 performs
some query optimisation. Several other data base applications with
natural language interface are described in Dahl (1977), Coelho (1982), and
Filgueiras and Pereira (1983).

Another example of data base application of Prolog is an
implementation of Query-by-Example (Neves and Williams 1983; Neves et al. 1983).
Chomicki and Grudzinski (1983) describe a system, based on extendible
hashing, that manipulates tuples stored on disk. The system has been
designed to support data-base-oriented implementations of Prolog.

The Toy-Sequel interpreter was rewritten as a scaled-down version of
SPOQUEL, a program which we had written with Włodek Grudzinski in
early 1982. It helped us through a difficult winter.







LISTING 8.4 Toy-Sequel interpreter


1   % Toy-Sequel interpreter
2   % (c) COPYRIGHT 1983 - Feliks Kluzniak, Stanislaw Szpakowicz
3   %     Institute of Informatics, Warsaw University
4   toysequel :- write( 'Toy-Sequel, IIUW Warszawa 1983' ), nl,
5       repeat, write( '|' ), tag( getcommand( Cmd, Errflag ) ),
6       tag( docommand( Cmd, Errflag ) ),
7       Cmd = sequelstop, !.
8
9   getcommand( Cmd, Errflag ) :-
10      readcmd( CmdString ),
11      scan( CmdString, TList ), compile( TList, Cmd ).
12
13  docommand( Cmd, Errflag ) :- var( Errflag ), !, Cmd.
14  docommand( _, _ ).
15
16  scan( CmdString, TList ) :-
17      phrase( tokens( TList ), CmdString ), tracescan( TList ).
18
19  compile( TList, Cmd ) :-
20      phrase( command( Cmd ), TList ), !, tracecompile( Cmd ).
21  compile( _, error ) :- synerr( badcommand ).
22
23  tracescan( Cmd ) :- tracescan, !, write( '--scanned'( Cmd ) ), nl.
24  tracescan( _ ).
25  tracecompile( Cmd ) :-
26      tracecompile, !, write( '--compiled'( Cmd ) ), nl.
27  tracecompile( _ ).
28
29  tracescan.   tracecompile.
30
31  % --- reader and scanner ---
32  % Reader stops on the first dot outside strings.
33  readcmd( String ) :- rdchsk( Ch ), readcmd( Ch, String ).
34
35  readcmd( '.', [] ) :- !, rch.
36  readcmd( '"', ['"' | Rest] ) :-
37      !, rdch( Ch ), readstr( Ch, Rest, RestAfter ),
38      rdch( Nextch ), readcmd( Nextch, RestAfter ).
39  readcmd( Ch, [Ch | Rest] ) :- rdch( Nextch ), readcmd( Nextch, Rest ).
40
41  readstr( '"', ['"' | Rest], Rest ) :- !.
42  readstr( Ch, [Ch | Rest], RestAfter ) :-
43      rdch( Nextch ), readstr( Nextch, Rest, RestAfter ).
44
45  % This scanner recognizes names, strings, integers, and single
46  % characters. Strings are returned as lists of characters.
47  % The tokens are: n(Name), s(String), i(Integer), ASingleCharacter.
48  tokens( [T | Ts] ) --> token( T ), !, sp, tokens( Ts ).
49  tokens( [] ) --> [].
50
51  token( n( Name ) ) -->
52      letter( L ), namechars( NN ), { pname( Name, [L | NN] ) }.
53  token( s( String ) ) --> ['"'], stringchars( String ).
54  token( i( Integer ) ) -->
55      sign( S ), digit( D ), digits( DD ),
56      { pnamei( I, [D | DD] ), signed( S, I, Integer ) }.
57  token( Ch ) --> [Ch].
58
59  letter( Ch ) --> [Ch], { letter( Ch ) }.
60
61  namechars( [Ch | Chs] ) --> letter( Ch ), !, namechars( Chs ).
62  namechars( [Ch | Chs] ) --> digit( Ch ), !, namechars( Chs ).
63  namechars( [] ) --> [].
64
65  % In a string "" stands for a single "
66  % (readcmd treats "" as two adjacent strings).
67  stringchars( ['"' | Chs] ) --> ['"', '"'], !, stringchars( Chs ).
68  stringchars( [] ) --> ['"'], !.
69  stringchars( [Ch | Chs] ) --> [Ch], stringchars( Chs ).
70
71  digit( Ch ) --> [Ch], { digit( Ch ) }.
72
73  digits( [D | DD] ) --> digit( D ), !, digits( DD ).
74  digits( [] ) --> [].
75
76  sign( '+' ) --> ['+'].
77  sign( '-' ) --> ['-'].
78  sign( '+' ) --> [].
79
80  signed( '+', I, I ).
81  signed( '-', I, Integer ) :- Integer is - I.
82
83  sp --> [' '], !, sp.        % optional spaces
84  sp --> [].





85
86  % These are used for attributes.
87  qname( Qual-Name ) --> [n( Qual ), '_', n( Name )], !.
88  qname( Variable-Name ) --> [n( Name )].   % i.e. fits all qualifiers
89  constant( Int, integer ) --> [i( Int )], !.
90  constant( Str, string ) --> [s( Str )].
91
92  % --- symbol table operations ---
93  :- op( 100, xfx, ':' ).
94
95  % Given a relation name and an alias (of select expression, procedure
96  % relname), use the relation's schema to push a new set of items onto
97  % symbol table stack and to return the generator. The format
98  % of a schema is described with create (procedure newrel).
99  newrelname( RelNm, Alias, Generator, OldST, [Alias : RelST | OldST] ) :-
100     'r e l'( RelNm, Generator, RelST ), !.
101 newrelname( RelNm, _, fail, OldST, OldST ) :- synerr( norelname( RelNm ) ).
102
103 % Given a qualified name, return its associated variable and type.
104 findattr( Q-Nm, Var, Type, [Q : RelST | ST] ) :-
105     member( attr( Nm, Type, Var ), RelST ), !.
106 findattr( QNm, Var, Type, [_ | ST] ) :- !, findattr( QNm, Var, Type, ST ).
107 findattr( QNm, _, _, [] ) :- synerr( noattribute( QNm ) ).
108
109 % --- command compiler ---
110 % See the various commands for examples of use. Command
111 % interpretation routines listed alongside command grammar processors.
112
113 command( Cmd ) --> create( Cmd ).
114 command( Cmd ) --> cancel( Cmd ).
115 command( Cmd ) --> select( Cmd ).
116 command( Cmd ) --> relations( Cmd ).
117 command( Cmd ) --> relation( Cmd ).
118 command( Cmd ) --> insert( Cmd ).
119 command( Cmd ) --> delete( Cmd ).
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120 

121 

122 

123 

124 

125 

126 
127 
120 

129 

130 

131 

132 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

145 

146 

147 

148 

149 

150 

151 

152 

153 

154 

155 

156 

157 

158 

159 

160 
161 
162 

163 

164 

165 

166 

167 

168 
169 


command! Cmd) 
oommandj Cmd) 
oommandj Cmd) 
oommandj Cmd) 


--> update( Cmd). 
-> stop( Cmd) r 
--> dump( Cmd}. 
k>ad( Cmd), 


% - - - create a new relation - - -

% Eg: create EMP < string name, integer salary, integer dno >.
% Eg: create DEPT<integer dno,string manager>.
% Eg: create BoardMembers<string name, string pos, integer seniority>.
% Note that lower/upper case matters. Keywords must be
% in lower case, otherwise use any convention you like.
create( newrel( RelName, [V | Vs], [attr( Nm, Type, V ) | As] ) ) -->
    [n( create ), n( RelName )],
    ['<'], typnam( Type, Nm ), typnams( Vs, As ), ['>'].

typnams( [V | Vs], [attr( Nm, Type, V ) | As] ) -->
    [','], !, typnam( Type, Nm ), typnams( Vs, As ).
typnams( [], [] ) --> [].

typnam( string, Nm ) --> [n( string ), n( Nm )], !.
typnam( integer, Nm ) --> [n( integer ), n( Nm )], !.
typnam( notype, Nm ) --> synerrc( typeexpected ).

% A schema stores a pattern for invoking the relation's tuples (the
% generator), and a list of symbol table entries linking attribute
% names and types with variables in the generator.
newrel( RelName, Vars, RelST ) :-
    not 'r e l'( RelName, _, _ ), !,
    mkgen( RelName, Vars, Generator ),
    assert( 'r e l'( RelName, Generator, RelST ) ).
newrel( RelName, _, _ ) :- namerr( duprelname( RelName ) ).

% Add a blank for (rudimentary) security.
mkgen( RelName, Vars, Generator ) :-
    pname( RelName, Chars ), pname( RelNm, [' ' | Chars] ),
    Generator =.. [RelNm | Vars].

% - - - cancel a relation - - -

% Eg: cancel EMP.

cancel( cancel( RelName ) ) --> [n( cancel ), n( RelName )].

cancel( RelName ) :- retract( 'r e l'( RelName, Generator, _ ) ), !,
    retract( Generator ), fail.

cancel( RelName ) :- namerr( unknown( RelName ) ).

% - - - queries - - -

% List the set generated by a select expression.
select( ( Generators, Filter, writetuple( Tup ), fail ) ) -->
    selectexp( set( Generators, Filter, Tup, _ ), [] ).


writetuple( [] ) :- !, nl.
writetuple( [V | Vs] ) :- wri( V ), write( ' ' ), writetuple( Vs ).
wri( [X | Y] ) :- !, writetext( [X | Y] ).
wri( X ) :- write( X ).

% List all relations. Eg: relations.
relations( ( 'r e l'( RelNm, _, _ ), write( RelNm ), nl, fail ) ) -->
    [n( relations )].
% List the attributes of a relation. Eg: relation EMP.
relation( relation( Name ) ) --> [n( relation ), n( Name )].

relation( RelN ) :- 'r e l'( RelN, _, Attrs ), !, listattrs( Attrs ).
relation( RelN ) :- write( RelN ), write( ' is not a relation!' ), nl.

listattrs( [] ) :- !.
listattrs( [attr( Name, Type, _ ) | Attrs] ) :-
    write( Type ), write( ' ' ), write( Name ), nl,
    listattrs( Attrs ).

% - - - select expression - - -
% Eg: select from EMP, Mgr=EMP, DEPT tuples < name, dno >
%       where salary > Mgr_salary*85/100
%       and Mgr_name = manager and DEPT_dno = EMP_dno
%       and < manager > in (<"Smith">, <"Jones">, <"Brown">).
% (ie get names and department numbers of those subordinates of Smith,
% Jones or Brown who earn more than 85% of their manager's salary).

% Generators pick up tuples from named relations, Filters pass only
% tuples fitting the where-clause, Tuple is instantiated to the passed
% tuples (one by one). Types is the tuple's pattern with types instead
% of attributes (used for type checking).
% The example compiles to :
%   set( (' EMP'(Name, Salary, Dno), ' EMP'(MgrName, MgrSalary, MgrDno),
%         ' DEPT'(Dno, MgrName) ),
%        (Salary > MgrSalary*85/100, true, true,
%         (member(MgrName, <"Smith", "Jones", "Brown">), true)),
%        [ Name, Dno ], [ string, integer ] )

selectexp( set( Generators, Filter, Tuple, Types ), InitST ) -->
    [n( select ), n( from )], relnames( Generators, InitST, ST ),
    [n( tuples )], tuplepattern( Tuple, Types, ST ),
    whereclause( Filter, ST ).

% One or more relation names, possibly "aliased". Symbol table
% fragments are stacked in reverse order, so attribute search order will
% be that of the from-list (using-list for update) relations.
relnames( ( Gen, Gens ), OldST, NewST ) -->
    relname( Name, Alias ), [','], !, relnames( Gens, OldST, TempST ),
    { newrelname( Name, Alias, Gen, TempST, NewST ) }.
relnames( Gen, OldST, NewST ) --> relname( Name, Alias ),
    { newrelname( Name, Alias, Gen, OldST, NewST ) }.

relname( Name, Alias ) --> [n( Alias ), '=', n( Name )], !.
relname( Name, Name ) --> [n( Name )].

% tuple pattern is also invoked by inexp.
tuplepattern( [A | As], [T | Ts], ST ) -->
    ['<'], attrpatt( A, T, ST ), attrpatts( As, Ts, ST ), ['>'].

attrpatts( [A | As], [T | Ts], ST ) -->
    [','], !, attrpatt( A, T, ST ), attrpatts( As, Ts, ST ).
attrpatts( [], [], _ ) --> [].

attrpatt( Attribute, Type, _ ) --> constant( Attribute, Type ), !.
attrpatt( A, T, ST ) --> qname( QN ), { findattr( QN, A, T, ST ) }.

whereclause( Filter, ST ) --> [n( where )], !, boolexp( Filter, ST ).
whereclause( true, _ ) --> [].

% - - - Boolean expressions - - -



LISTING 8.4 (Continued)


% Eg: salary > Mgr_salary * 85/100
%       or <name> in select from BoardMembers tuples <name>
% Note that embedded select expressions do not modify the symbol table,
% whose extensions are visible only in the nested constructs.
boolexp( E, ST ) --> bterm( T, ST ), rboolexp( T, E, ST ).

rboolexp( L, (L ; R), ST ) --> [n( or )], !, boolexp( R, ST ).
rboolexp( E, E, _ ) --> [].

bterm( T, ST ) --> bfactor( F, ST ), rbterm( F, T, ST ).

rbterm( L, (L , R), ST ) --> [n( and )], !, bterm( R, ST ).
rbterm( T, T, _ ) --> [].

bfactor( not F, ST ) --> [n( 'not' )], !, bfactor( F, ST ).
bfactor( E, ST ) --> ['('], !, boolexp( E, ST ), [')'].
bfactor( E, ST ) --> inexp( E, ST ), !.
bfactor( E, ST ) --> relexp( E, ST ).

% - - - set membership - - -
% Eg: < dno, name > in < 1, "Jones" >, < 2, "Smith" >
% Eg: < name > in (select from BoardMembers tuples <name>)
inexp( ( Generator, Filter ), ST ) -->
    tuplepattern( Patt, Type, ST ), [n( in )],
    setexp( set( Generator, Filter, Tuple, Types ), ST ),
    matchpatterns( Patt, Type, Tuple, Types ).

% matchpatterns is a rule, so that synerrc can show context.
matchpatterns( Patt, Types, Patt, Types ) --> !.
matchpatterns( P1, T1, P2, T2 ) -->
    synerrc( badinexppattern( T1, P1, T2, P2 ) ).

% - - - set expressions - - -
% A sequence of tuples or a select expression, possibly in parentheses.
% The generator for a sequence of tuples is a call on member with the
% second parameter instantiated to a list of these tuples.
setexp( Set, ST ) --> ['('], !, setexp( Set, ST ), [')'].
setexp( Set, ST ) --> selectexp( Set, ST ), !.
setexp( set( member( Patt, [Tup | Tups] ), true, Patt, Types ), ST ) -->
    tuple( Tup, Types ), tuples( Tups, Types ),
    { mkpattern( Types, Patt ) }, !.
setexp( set( fail, fail, [], [] ), _ ) --> synerrc( badsetexpr ).

tuples( [Tup | Tups], Types ) --> [','], !, tuple( Tup, TupTypes ),
    { checktype( Types, TupTypes ) }, tuples( Tups, Types ).
tuples( [], _ ) --> [].

tuple( [A | As], [T | Ts] ) -->
    ['<'], constant( A, T ), constants( As, Ts ), ['>'], !.
tuple( [], [] ) --> ['<'], synerrc( badtuple ), { fail }.

constants( [A | As], [T | Ts] ) -->
    [','], !, constant( A, T ), constants( As, Ts ).
constants( [], [] ) --> [].

checktype( Type, Type ) :- !.
checktype( T1, T2 ) :- synerr( inconsistent( T1, T2 ) ).

% Patt in set(_, _, Patt, _) is a list of n fresh variables
% (n is the length of tuples in this set).
mkpattern( [], [] ) :- !.
mkpattern( [_ | Types], [V | Vs] ) :- mkpattern( Types, Vs ).

% - - - relational expressions - - -
relexp( E, ST ) -->
    simplexp( LeftE, LeftType, ST ), relop( Op ), !,
    simplexp( RightE, RightType, ST ),
    { consrel( LeftE, LeftType, Op, RightE, RightType, E ) }.

relop( '=<' ) --> ['=', '<'].     relop( '>=' ) --> ['>', '='].
relop( '=\=' ) --> ['<', '>'].    relop( '<' ) --> ['<'].
relop( '=' ) --> ['='].           relop( '>' ) --> ['>'].

consrel( L, Type, Op, R, Type, E ) :- consrel( L, Op, R, Type, E ), !.
consrel( L, LType, Op, R, RType, fail ) :-
    E =.. [Op, L, R], synerr( typeconflict( LType, RType, E ) ).

% The first clause does compile-time equality.
consrel( Arg, '=', Arg, _, true ).
consrel( L, '=', R, string, fail ).
consrel( L, '=\=', R, string, not L = R ).
consrel( L, Op, R, integer, E ) :- E =.. [Op, L, R].
consrel( L, '<', R, string, lstr( L, R ) ).
consrel( L, '=<', R, string, ( lstr( L, R ) ; L = R ) ).
consrel( L, '>', R, string, lstr( R, L ) ).
consrel( L, '>=', R, string, ( lstr( R, L ) ; R = L ) ).

% Compare strings lexicographically.
lstr( [], [_ | _] ) :- !.
lstr( [Ch1 | _], [Ch2 | _] ) :- Ch1 @< Ch2, !.
lstr( [Ch1 | Chs1], [Ch1 | Chs2] ) :- lstr( Chs1, Chs2 ).

% - - - simple expressions - - -
simplexp( E, string, ST ) --> stringexp( E, ST ), !.
simplexp( E, integer, ST ) --> arithexp( E, ST ).

stringexp( Str, _ ) --> [s( Str )], !.
% Type checking delayed to avoid error messages - might be integer.
stringexp( Var, ST ) -->
    qname( QN ), { findattr( QN, Var, Type, ST ), Type = string }.

arithexp( E, ST ) --> aterm( T, ST ), rarithexp( T, E, ST ).

rarithexp( L, E, ST ) -->
    ['+'], !, aterm( T, ST ), rarithexp( L+T, E, ST ).
rarithexp( L, E, ST ) -->
    ['-'], !, aterm( T, ST ), rarithexp( L-T, E, ST ).
rarithexp( E, E, _ ) --> [].

aterm( T, ST ) --> afactor( F, ST ), raterm( F, T, ST ).

raterm( L, T, ST ) -->
    ['*'], !, afactor( F, ST ), raterm( L*F, T, ST ).
raterm( L, T, ST ) -->
    ['/'], !, afactor( F, ST ), raterm( L/F, T, ST ).
raterm( T, T, _ ) --> [].

afactor( E, ST ) --> ['('], !, arithexp( E, ST ), [')'].
afactor( Int, _ ) --> [i( Int )], !.
afactor( Var, ST ) -->
    qname( QN ), { findattr( QN, Var, Type, ST ), Type = integer }, !.
afactor( 0, _ ) --> qname( QN ), !, synerrc( notinteger( QN ) ).
afactor( 0, _ ) --> synerrc( nointegerfactor ).

% - - - insert - - -
% Eg: into EMP insert <"Jones",1000,1>, <"Smith",1200,2>.
% Eg: into EMP insert select from DEPT tuples <manager, 1050, dno>.
insert( ( Generators, Filter, assertz( NewTuple ), fail ) ) -->
    [n( into ), n( RelName )],
    { 'r e l'( RelName, _, RelST ) }, !, [n( insert )],
    setexp( set( Generators, Filter, Tuple, Types ), [] ),
    { checktypes( Types, RelST ),
      mkgen( RelName, Tuple, NewTuple ) }.
insert( fail ) --> [n( into ), n( RelNm )],
    synerrc( norelname( RelNm ) ).

checktypes( [], [] ) :- !.
checktypes( [T | Ts], [attr( _, T, _ ) | As] ) :- !, checktypes( Ts, As ).
checktypes( Types, Attrs ) :- synerr( badsettype( Types, Attrs ) ).

% - - - delete - - -
% Eg: from EMP delete all tuples.
% Eg: from EMP delete tuples where salary < 1000 and
%       <dno> in select from DEPT tuples <dno>
%       where manager="Smith".
% (ie fire all subordinates of Smith who earn less than a 1000)
delete( ( RelGen, RelFilter, retract( RelGen ), fail ) ) -->
    [n( from ), n( RelNm )],
    { newrelname( RelNm, RelNm, RelGen, [], ST ) },
    [n( delete )], delfilter( RelFilter, ST ).

delfilter( true, _ ) --> [n( all ), n( tuples )], !.
delfilter( RelFilter, ST ) -->
    [n( tuples ), n( where )], boolexp( RelFilter, ST ).

% - - - update - - -
% Eg: update EMP using DEPT, Mgr=EMP
%       so that salary = salary + (Mgr_salary - salary)/5
%       where salary < Mgr_salary - 1000 and Mgr_name = manager
%         and DEPT_dno = dno
%         and not <Mgr_name> in
%           select from BoardMembers tuples <name>.
% (ie to all employees who earn over a 1000 less than their manager
% give a raise equal to 20% of the difference, provided the manager
% does not sit on the board)
% This is compiled to :
%   ' EMP'( Name, Sal, Dno ),                                 % OldTup
%   ( ' DEPT'(Dno,Manager), ' EMP'(Manager,MgrSal,MgrDno) ),  % UseGens
%   ( Sal < MgrSal - 1000, true, true,
%     not (' BoardMembers'(Manager,_,_), true) ),             % Filter
%   NewSal is Sal + (MgrSal - Sal)/5,                         % Modifications
%   retract(' EMP'(Name,Sal,Dno)), assert(' EMP'(Name,NewSal,Dno)), fail
update( ( OldTup, UseGens, Filter, Modifications,
          retract( OldTup ), assert( NewTup ), fail ) ) -->
    [n( update ), n( RelNm )],
    { 'r e l'( RelNm, OldTup, OldST ),
      'r e l'( RelNm, NewTup, NewST ), !,
      makemodlist( OldST, NewST, MList ) },
    usingclause( UseGens, UseST ), { ST = [RelNm : OldST | UseST] },
    [n( so ), n( that )],
    modifier( Modification, MList, ST ),
    modifiers( Modification, Modifications, MList, ST ),
    { closemodlist( MList ) }, whereclause( Filter, ST ).
update( fail ) --> [n( update )], synerrc( noupdatedrelation ).

usingclause( Gens, ST ) --> [n( using )], !, relnames( Gens, [], ST ).
usingclause( true, ST ) --> [].

modifiers( M, (M , Ms), MList, ST ) -->
    [','], !, modifier( MM, MList, ST ),
    modifiers( MM, Ms, MList, ST ).
modifiers( M, M, _, _ ) --> [].

modifier( AttrVar is Expr, MList, ST ) -->
    [n( Nm )], { findmname( Nm, AttrVar, Type, MList ) },
    ['='], simplexp( Expr, EType, ST ),
    { mtype( Type, EType, Nm ) }.

% A "mod list" lists updated relation's attributes together with new
% variables forming new tuple's attributes and Mod variables which are
% used to flag an attribute's modification when it is detected on the
% left hand side of an equality in "so that"-list.
makemodlist( [Old | Olds], [attr( _, _, NewV ) | NewVs],
             [modif( Old, NewV, Mod ) | Mods] ) :-
    !, makemodlist( Olds, NewVs, Mods ).
makemodlist( [], [], [] ).

% Bind old and new variables in "modlist" entries with clear flag.
closemodlist( [Mod | Mods] ) :-
    closemod( Mod ), !, closemodlist( Mods ).
closemodlist( [] ).
closemod( modif( attr( _, _, OldV ), OldV, Mod ) ) :- var( Mod ).
closemod( _ ).

% Flag an updated attribute in "modlist".
findmname( Nm, NewV, T, MList ) :-
    member( modif( attr( Nm, T, _ ), NewV, Mod ), MList ), !,
    mmod( Mod, Nm ).
findmname( Nm, _, _, _ ) :- synerr( notinupdatedrel( Nm ) ).

% if no errors, the first clause of mmod fails, the second binds.
mmod( Mod, Nm ) :- not var( Mod ), !, synerr( updatedtwice( Nm ) ).
mmod( true, _ ).

mtype( Type, Type, _ ) :- !.
mtype( T1, T2, Nm ) :- synerr( typeconflict( T1, Nm, T2 ) ).

% - - - control commands - - -
stop( sequelstop ) --> [n( stop )].

sequelstop.     % do nothing (cf the main procedure)

load( consult( FileName ) ) --> [n( load ), n( from ), n( FileName )].

dump( dump( FileName ) ) --> [n( dump ), n( to ), n( FileName )].

dump( FileName ) :- tell( FileName ),
    'r e l'( Nm, Gen, ST ), wclause( 'r e l'( Nm, Gen, ST ) ),
    Gen, wclause( Gen ), fail.
dump( _ ) :- write( 'end.' ), nl, told.

wclause( Cl ) :- writeq( Cl ), write( '.' ), nl.

% - - - error handling ("bare bones" version) - - -
synerr( Info ) :- synmes( Info ), ancestor( getcommand( _, error ) ).

synerrc( Info ) --> { synmes( Info ), write( 'Context: ' ) },
    context.     % will fail eventually!
synerrc( _ ) --> { nl, ancestor( getcommand( _, error ) ) }.

synmes( Info ) :- nl, write( '--- Syntactic error: ' ), write( Info ), nl.

context --> [Token], { wtoken( Token ) }, context.

wtoken( T ) :- wt( T, RealT ), write( RealT ), write( ' ' ), !.
wt( n( Name ), Name ).
wt( i( Integer ), Integer ).
wt( s( String ), String ).
wt( Char, Char ).

namerr( Info ) :- nl, write( '--- Error:' ), nl,
    write( Info ), nl, tagfail( docommand( _ ) ).



9 PROLOG DIALECTS¹


9.1. PROLOG I 


The idea of logic programming emerged in Marseilles in the first half
of 1972 while Robert Kowalski was visiting the artificial intelligence team
founded by Alain Colmerauer at the University of Marseilles. Colmerauer
with his team prepared the design specification of the programming
language Prolog (Colmerauer et al. 1972). The language resembled a theorem
prover rather closely, but it already possessed the essential properties of
contemporary Prolog, and even some features reintroduced quite
recently, e.g. delaying calls till appropriate instantiation of their arguments.
Almost at the same time Kowalski advocated predicate calculus as a
formalism for expressing algorithms without commitment to a specific
strategy of their execution; the short note (1972) was later expanded to a
larger paper (1974). Hence the two pioneers of logic programming took
from the outset different approaches to the problem of changing the idea
into reality. Although developed in close interaction, these different
attitudes still manifest themselves in logic programming research.

The language described in Colmerauer et al. (1972) was implemented
in Algol W on IBM 360/67 by Philippe Roussel and used at once in several
applications (Colmerauer et al. 1973, Pasero 1973, Kanoui 1973, Joubert
1974, Bergman and Kanoui 1973, 1975, Battani and Meloni 1975, Guizol
1975). It was quickly replaced by an improved version, coded partly in
Fortran by Battani and Meloni (1973) and partly in Prolog by Colmerauer
and Roussel. This was the first version of Prolog used outside Marseilles.
Although not christened so by its authors, it deserves the name of Prolog
I, especially as its commonly used name “Marseille Prolog” has become
ambiguous.

¹ This chapter was contributed by Janusz S. Bień, Institute of Informatics, Warsaw
University, Warsaw, Poland.

The original reference to Prolog I is Roussel (1975); some historical
information can be found in Battani and Meloni (1973) and Kluźniak
(1984). The syntax of the language is illustrated below by the sample
clauses:

+APPEND( NIL, *X, *X ). 

+APPEND( *X.*Y, *Z, *X.*V ) -APPEND( *Y, *Z, *V ). 

This should be preceded by the declaration of the infix dot: 

-AJOP( 1, ”X’( XX )” )! 

And a sample directive (SORT means write): 

-APPEND( A.B.NIL, C.NIL, *X ) -SORT( *X ) -LIGNE! 

Positive literals (see Chapter 2) are preceded with +, negative with -; the
notation allows representation of non-Horn clauses. This was a natural
requirement, because the early versions of Prolog were intended to
implement a general theorem-proving method known as linear resolution with
selection function (Kowalski and Kuehner 1971). Program clauses were
distinguished from directives by a different terminator. The syntax
survived its original motivation and is still used in some versions of the
language (Kluźniak and Szpakowicz 1983).


9.2. PROLOG II 

After Prolog I was released, Colmerauer’s team experimented with
various mutations of the language; some of them have been described in
Guizol and Meloni (1976), Colmerauer et al. (1979) and Kanoui and Van
Caneghem (1980). Finally it was announced that the goal of creating “the
ultimate Prolog” was achieved (Colmerauer et al. 1981). The new
language was called Prolog II by its authors (Colmerauer 1982, Van
Caneghem 1982, Kanoui 1982).

The most important innovation of Prolog II is the treatment of cyclic
data structures (see Section 1.2.3). They are simply valid representations
of infinite trees, which can be manipulated in a similar way to other terms
(Colmerauer 1979, 1982). However, the same infinite tree can be
represented by different data structures; to let them be matched correctly it
appeared necessary to treat functors in the same way as arguments
(Filgueiras 1982). As a result, a functor can be a variable or a compound term.
In Prolog II, the standard form of terms is considered just a shorthand
notation for a more general form called tuple. For example, ff(x) stands
for <ff, x>, while <x, y> and <<ff(x)>, y> are also legal terms
(single-letter names denote variables, ff is a constant). Instead of unifying two
tuples, Prolog II constructs a system of equations. For example, matching
<x> with <ff(x)> corresponds to solving in x the equation

x = ff( x ).

The solution of this equation is the infinite tree ff(ff(ff(...))), which is
represented by an appropriate cyclic data structure.
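
The same equation can be tried out in a present-day system. The sketch
below assumes SWI-Prolog 7 or later (not Prolog II itself), where
unification omits the occurs check by default and cyclic terms are
supported; the names infinite_ff and same_tree are ours, chosen for
illustration only.

```prolog
% A sketch, assuming SWI-Prolog 7+ (an assumption, not Prolog II).
% infinite_ff/1 and same_tree/0 are hypothetical names.
infinite_ff( X ) :-
    X = ff( X ),          % solved by building a cyclic structure
    cyclic_term( X ).     % built-in test: X represents an infinite tree

% Two differently built representations of the same infinite tree
% are matched successfully.
same_tree :-
    X = ff( X ),
    Y = ff( ff( Y ) ),
    X = Y.
```

Both goals are expected to succeed in such a system; in a Prolog with a
strict occurs check the first unification would simply fail.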

The behavior of Prolog programs is described as incremental solving
of the system of equations introduced by program clauses, which can also
be seen as rewriting rules. The execution of a call is viewed as the
operation of erasing it by applying the rules and solving the appropriate
equations. This viewpoint manifests itself in the syntax of clauses by an arrow
leading from the head to the (possibly empty) body, e.g.

append( nil, x, x ) -> ;

append( x.y, z, x.v ) -> append( y, z, v ) ;

This approach makes it possible to describe the principles of Prolog II in a
compact and self-contained way, relieved from the references to
theorem-proving techniques and relying only on the most fundamental and
intuitive notions of logic (Colmerauer 1983).

Prolog II offers a simple yet powerful coroutining mechanism (see
Chapter 2). A call may require its parameter to be instantiated. Given a
variable, it waits for it to become bound. This is achieved by the built-in
procedure geler (“freeze”), whose first argument is a “trigger” (usually a
variable to be bound) and the second a call (to be delayed). If the
“trigger” is already bound, geler simply executes the call.
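
As a hedged illustration, several later systems provide a built-in
freeze/2 with the same argument order as geler. The sketch below assumes
SWI-Prolog; delayed_double and demo are hypothetical names introduced
for this example.

```prolog
% A sketch, assuming SWI-Prolog; freeze/2 stands in for geler here.
% delayed_double/2 and demo/1 are hypothetical names.
delayed_double( X, Y ) :-
    freeze( X, Y is 2 * X ).   % the call waits until X is bound

demo( Y ) :-
    delayed_double( X, Y ),    % nothing is computed yet
    var( Y ),                  % Y is still unbound at this point
    X = 21.                    % binding the trigger runs the delayed call
```

Calling demo( Y ) binds Y to 42 only after the trigger X has been bound.
If X is already bound when freeze is called, the goal runs at once,
which matches the behaviour described for geler.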

Another coroutining primitive is dif, which succeeds if its parameters
are not “perfectly” equal (see the built-in procedure ==, Section 5.6).
If during the execution of dif (“down the terms’ structure”) a variable is
encountered, dif waits until it becomes bound and only then resumes the
comparison.
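
The waiting behaviour can be observed with the dif/2 constraint found in
several later systems. This is a minimal sketch assuming SWI-Prolog; the
three predicate names are hypothetical.

```prolog
% A sketch, assuming SWI-Prolog, where dif/2 is a built-in constraint.
differs_later :-
    dif( X, a ),     % X is unbound, so the comparison is suspended
    X = b.           % the comparison resumes and the terms differ

clashes_later :-
    dif( X, a ),
    X = a.           % violates the suspended constraint, so this fails

% The comparison resumes "down the terms' structure":
nested :-
    dif( f( X, 1 ), f( a, Y ) ),
    X = a,           % still undecided, because Y is unbound
    Y = 2.           % 1 and 2 differ, so the constraint is satisfied
```

Here differs_later and nested succeed, while clashes_later fails.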

The coroutining mechanism will be illustrated by two examples
adapted from Colmerauer et al. (1983). The first example is a procedure
that takes two trees represented as deeply nested dotted list structures. It
succeeds when both structures can be flattened to the same linear list. A
built-in procedure ident is used to check whether the argument is a
constant.

sameleaves( a, b ) -> leaves( a, u ) leaves( b, u ) list( u );

leaves( a, u ) -> geler( u, leaves1( a, u ) );

leaves1( a, a.nil ) -> ident( a );

leaves1( a.l, a.u ) -> ident( a ) leaves( l, u );

leaves1( ( a.b ).l, u ) -> leaves1( a.b.l, u );

list( nil ) -> ;

list( a.u ) -> list( u );

The second example is a procedure that generates a list of digits 1, 2, 3
such that all three elements are different.

perm( x.y.z.nil ) -> alldifferent( x.y.z.nil )
                     alldigits( x.y.z.nil );

alldigits( nil ) -> ;

alldigits( x.l ) -> digit( x ) alldigits( l );

digit( 1 ) -> ;  digit( 2 ) -> ;  digit( 3 ) -> ;

alldifferent( nil ) -> ;

alldifferent( x.l ) -> outside( x, l ) alldifferent( l );

outside( x, nil ) -> ;

outside( x, y.l ) -> dif( x, y ) outside( x, l );
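
For readers who want to run the second example, here is a transcription
into Edinburgh syntax. It is a sketch that assumes a system with a
coroutining dif/2 (SWI-Prolog is assumed); the underscored predicate
names are ours.

```prolog
% A transcription sketch, assuming SWI-Prolog and its built-in dif/2.
perm( [X, Y, Z] ) :-
    all_different( [X, Y, Z] ),   % posts suspended dif/2 constraints
    all_digits( [X, Y, Z] ).      % generates; the constraints prune at once

all_digits( [] ).
all_digits( [X | L] ) :- digit( X ), all_digits( L ).

digit( 1 ).  digit( 2 ).  digit( 3 ).

all_different( [] ).
all_different( [X | L] ) :- outside( X, L ), all_different( L ).

outside( _, [] ).
outside( X, [Y | L] ) :- dif( X, Y ), outside( X, L ).
```

The point of the design is that the test (all_different) is written
before the generator (all_digits): each constraint wakes up as soon as
both of its variables are bound, so a clash such as X = 1, Y = 1 is
rejected before Z is ever tried. Collecting all answers with
findall( P, perm( P ), Ps ) yields the six permutations of 1, 2, 3.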

Prolog II supports a kind of modularisation implemented by so-called
“worlds”, each with a unique name. Worlds are organised into a tree
structure. The root is the world “origine”, which has two subworlds
“ordinaire” and “?????”. Subworlds of “ordinaire” (which is the
default) can be created by the user who can walk up and down the tree, and
also create and discard worlds. “?????” contains the Prolog II supervisor
and cannot be used as the current world. Every procedure name is
associated with the world in which it was first mentioned (“declared”). It is
accessible in this world and its descendants but not in its siblings.
Moreover, a name N and the same name N declared in a superworld later on
refer to different procedures.

Clauses are available for all manipulations (including initial definition)
only in the world where the procedure name has been declared and in its
direct subworlds. For example, standard procedure names are introduced
in “origine” (with clauses defined in “?????”) and used in
“ordinaire”. So, the user cannot change a standard procedure definition.
Clause indexing is provided, or rather tuple indexing. The leftmost name
in the tuple is used as a key.

The purpose of Toy’s tag, tagexit etc. (see Section 5.12) is served in
Prolog II by a pair of built-in procedures bloc, fin-bloc. In the call bloc(l,
t), t is the call to be executed in the block. When, during the execution of
t, a call of the form fin-bloc(l1) is encountered, the most recent call on
bloc(l, t) with l unifiable with l1 is sought; if none is found, an error
condition is raised, else this call on bloc succeeds deterministically. This
feature is used for error handling and for exiting loops.
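
A rough present-day analogue of this pair is catch/3 with throw/1. The
sketch below assumes an ISO-conforming Prolog and is not a definition of
bloc and fin-bloc; first_negative and scan are hypothetical names.

```prolog
% A sketch, assuming an ISO-conforming Prolog; catch/3 and throw/1 are
% used as a rough analogue of bloc and fin-bloc, not as their definition.
% first_negative/2 and scan/1 are hypothetical names.
first_negative( List, Result ) :-
    catch( ( scan( List ), Result = none ),
           found( N ),              % plays the role of the label
           Result = N ).

scan( [] ).
scan( [X | Xs] ) :-
    (   X < 0
    ->  throw( found( X ) )         % leaves the enclosing block at once
    ;   scan( Xs )
    ).
```

For the list [3, 1, -4, 2, -7] the result is -4, and for a list without
negative numbers it is none. As with fin-bloc, the innermost enclosing
call whose label unifies is selected, and an error is reported when
there is none.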

We have presented here only some of the Prolog II features, and the 
interested reader is referred to Colmerauer et al. (1983). It is interesting 
that the pilot implementation of Prolog II which is described here was 
done on an Apple II microcomputer using software paging on floppy 
disks. 


9.3. MICRO-PROLOG AND MPROLOG 


micro-Prolog is the dialect used by Kowalski’s team at Imperial
College of Science and Technology in London. micro-Prolog was developed
and implemented by McCabe (1981); his main goal was to install Prolog on
a cheap 8-bit microcomputer. micro-Prolog uses lists to represent terms,
calls and clauses, e.g.

(( Append () x x ))

(( Append (x|X) Y (x|Z) ) ( Append X Y Z ))

The list notation has several advantages: predicates and functors need not
have a fixed number of arguments (i.e. the whole argument list can be
bound to a single variable); their names can be arbitrary list structures (for
technical reasons this is not allowed for predicates). micro-Prolog also
supports a simple form of modularity: modules are created dynamically
and the accessibility of names is determined by export/import lists.

A typical user is not expected to interact directly with micro-Prolog,
because a special front-end called Simple is provided to conceal the
low-level language features. Here is the append example in the Simple syntax:

Append( () x x )

Append( (x|X) Y (x|Z) ) if Append( X Y Z )

Simple was used as a computer language for children; other interesting
applications are expert systems. The language is subject to various
experiments and extensions, such as an explanation facility or an original form
of input/output operations called query-the-user (Sergot 1982); Simple
and other extensions are all written in micro-Prolog. Both micro-Prolog
and some of its applications are extensively documented in Ennals (1983)
and Clark and McCabe (1984).

MPROLOG was developed at the Institute for Co-ordination of
Computer Techniques, in Budapest (Bendl et al. 1980, SzKI 1982), using the
programming language CDL2 (Koster 1974). MPROLOG is an
upward-compatible extension of Prolog-10, intended for creating production
software for mainframe computers. The crucial extension consists in
introducing a form of modularity, based on the ideas of Szeredi (1982) and
similar in spirit to that found in many other languages. The modules are
syntactic units and contain explicit export/import lists determining the
visibility (i.e. the accessibility) of names; a visible name can serve as a
functor or as a predicate name. When a program is entered, only the main
module is loaded and executed; other modules must be loaded explicitly
by calling appropriate built-in procedures.

MPROLOG is a large system which includes several components.
The pretranslator produces an internal form of a program module; the
consolidator links the modules into a program; the interpreter executes it.
The program development support system (PDSS) provides a dedicated
editor and debugging aids. A compiler and an optimizer are under
development. The language offers a multitude of built-in procedures (probably
more than any other Prolog system) and interfaces to user-supplied
procedures written in CDL2 and Fortran.

APPENDIX A.1
Kernel File 


1 % KERNEL file 

2 % standard atoms 

3 ’;72 \72 , ca!l71 YagVI 

4 □70 \72 ’error71 userYO 

5 

6 % atoms identifying system routines (keep lair first and "true' last) 

7 'fail7Q tagVI 'calm T/O 

8 'tagcutYI lagfailVI 'tagexitYI 'ancestor’/l 

9 'haltYI 'statusYO 

10 ’display71 ’rch70 lastchVI 'skipblYO ’wchYl 

11 ’echo70 ’noechoVO 

12 ’see7l ’seeingVI 'seenVO ’tell’/l telling’/l ’toldVO 

13 s ordchr72 'sum73 *prod74 1ess72 ’#<72 

14 smalletterVI ’bigletterYI ’letter'/l 'digitVI ’alphanumYl 

15 bracketYI 'solocharYI 'symchVI 

16 'eqvar'/2 Var'/I 

17 atomVI ’integer’/l 'nonvarintYI 

18 ’functor73 ’arg73 pname72 pnamei72 

19 ’SprocVI 'SprodimitYO 'SprocinitYO 

20 ’clause'/S ’retract73 'abolish72 S assert73 Yedefine'/O 

21 predefined72 ’protect’/O 

22 ’nonexistentYO ’nononexistentYO 

23 debug70 'nodebug70 

24 'true 70 

25 

26 % kernel library 

27 error(:0) : nl, display('+++ System call error: ’). display(:0) , 

28 nl. fall. Q 

29 :ordchr(10* :Q) * assert (iseo In (:0), [], 0). 

30 assert(nf t wch(:0).[]. 0) □ # 


31 

-{:0, :0; 

ID 


32 

7(:0, :1) 

: call(:0) 

, call(:1). □ 

33 

'I'CO.J 

: call(:0). 

D 

34 

:0> 

: call(:0) . 

n 

35 

not(:0) : 

call(:0). 

T . fail. n 


36 not(J : [] 

37 check(:0) : not(not(:0)) . (] 

38 ’side_etfects'(:0) : not(not(:0)), Q 

39 

40 once(:0) : call(:0) . T , B 

41
42 '@=<'(:0, :1) : '@<'(:1, :0) . ! . fail . []
43 '@=<'(_, _) : []
44 '@>'(:0, :1) : '@<'(:1, :0) . []
45 '@>='(:0, :1) : '@=<'(:1, :0) . []
46
47 % - - - basic input procedures - - -
48 rdchsk(:0) : rch . skipbl . lastch(:0) . []
49 rdch(:0) : rch . lastch(:1) . sch(:1, :0) . []
50 % convert nonprintable characters to blanks
51 sch(:0, :0) : '@<'(' ', :0) . ! . []
52 sch(:0, ' ') : []
53
54 repeat : []
55 repeat : repeat . []
56 member(:0, :0.:1) : []
57 member(:0, _.:1) : member(:0, :1) . []
58
59 proc(:0) : '$procinit' . '$pr'(:0) . []
60 '$pr'(:0) : '$proclimit' . ! . fail . []
61 '$pr'(:0) : '$proc'(:0) . []
62 '$pr'(:0) : '$pr'(:0) . []
63
64 % bagof (preserves order of solutions)
65 bagof(:0, :1, _) : asserta('BAG'('BAG')) . call(:1) .
66         asserta('BAG'(:0)) . fail . []
67 %% 0 Item, 1 Condition
68 bagof(_, _, :0) : 'BAG'(:1) . ! . intobag(:1, [], :0) . []
69 %% 0 Bag, 1 Item,
70 intobag('BAG', :0, :0) : ! . retract('BAG', 1, 1) . []
71 %% 0 Final bag,
72 intobag(:0, :1, :2) : retract('BAG', 1, 1) . 'BAG'(:3) . ! .
73         intobag(:3, :0.:1, :2) . []
74 %% 0 Item, 1 This_bag, 2 Final bag, 3 Next item,
75
76 % end of file - toyprolog will now read from the terminal
77 : display('Kernel file loaded.') . nl . see(user) . [] #
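The bagof procedure in the kernel library keeps solutions in the database as a stack with a marker at the bottom, and popping the stack restores the order of solutions. The sketch below restates that idea in ordinary Prolog syntax. It is not the kernel code: the names collect/3, into_bag/3 and bag_entry/1 are hypothetical, and items are wrapped in item/1 so that a solution can never be mistaken for the marker.

```prolog
:- dynamic bag_entry/1.

% collect(Item, Cond, Bag): Bag lists the instances of Item for which
% Cond succeeds, in the order the solutions were found.
collect(Item, Cond, _) :-
    asserta(bag_entry(mark)),            % marker: bottom of this bag
    call(Cond),
    asserta(bag_entry(item(Item))),      % push each solution
    fail.                                % failure-driven loop
collect(_, _, Bag) :-
    retract(bag_entry(Top)), !,          % pop the newest entry
    into_bag(Top, [], Bag).

% into_bag(Entry, BagSoFar, FinalBag): pop entries down to the marker.
into_bag(mark, Bag, Bag) :- !.
into_bag(item(X), Acc, Bag) :-
    retract(bag_entry(Next)), !,
    into_bag(Next, [X|Acc], Bag).

% Sample facts for the usage example.
num(1).
num(2).
num(3).
```

With these facts, the goal `collect(X, num(X), L)` binds L to `[1, 2, 3]`. Because every call pushes its own marker, nested calls do not disturb each other.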
APPENDIX A.2
“Bootstrapper”


1 % % % translator of Prolog-10(mini) into "kernel-prolog" % % %
2 translate(:0, :1) : see(:0) . tell(:1) . program . seen . told .
3         see(user) . tell(user) . display(translated(:0)) . nl . []
4 %% 0 from_file, 1 to_file
5 %----
6 % main loop
7 program : rch . skpb(:0) . tag(transl(:0)) . isendsym(:0) . ! . []
8 program : program . []
9 transl('@') : ! . rch . []
10 transl('%') : comment('%', :0, []) . ! . puttr(:0) . []
11 transl(:0) : clause(:0, :1, [], :2) . puttr(:1) . putvarnames(:2, 0) . []
12 %% 0 startch, 1 termrepr, 2 sym_tab
13 isendsym('@') : [] % otherwise fail, ie loop

14 ... 

15 % error handling: skip to the nearest dot 

16 err(:0, :1) : display('*** error in ') . display(:0) .
17         display(': unexpected "') . display(:1) . lastch(:2) .
18         display('". text skipped: ') . skip(:2) . nl . tagfail(transl(_)) . []
19 %% 0 procname, 1 bad item, 2 first skipped char
20 skip('.') : wch('.') . ! . []
21 skip(:0) : wch(:0) . rch . lastch(:1) . skip(:1) . []

22 %.... 

23 % a comment extends till end of line 

24 comment(:0, :0.:1, :1) : iseoln(:0) . []
25 %% 0 eoln, 1 rest of termrepr
26 comment(:0, :0.:1, :2) : rch . lastch(:3) . comment(:3, :1, :2) . []
27 %% 0 char, 1 termrepr, 2 rest of termrepr, 3 nextchar

28 %.. ---- 

29 % read a goal 

30 clause(':', ':'.:0, :1, :2) : ! . ctail(':', :0, ' '.'#'.:1, :2) . []
31 %% 0 termrepr, 1 rest of termrepr, 2 sym_tab
32 % read an assertion/rule
33 clause(:0, :1, :2, :3) : fterm(:0, :4, :1, ' '.':'.:5, :3) .
34         ! . ctail(:4, :5, :2, :3) . []
35 %% 0 fterm_firstch, 1 termrepr, 2 rest_of_termrepr,
36 %% 3 sym_tab, 4 ctail_firstch, 5 middletermrepr
37 clause(:0, _, _, _) : err(clause, :0) . []
38 %...
39 % clause tail
40 ctail('.', ' '.'['.']'.:0, :0, _) : ! . []
41 %% 0 rest_of_termrepr
42 % righthand side of a non-unit clause, or a goal
43 % eoln and blanks inserted to make the output look tidy
44 ctail(':', :4.' '.' '.' '.:0, :1, :2) : rdch('-') . ! . iseoln(:4) .
45         rdchsk(:3) . ctailaux(:3, :0, :1, :2) . []
46 %% 0 termrepr, 1 rest_of_termrepr, 2 sym_tab, 3 calls_firstch,
47 %% 4 eoln
48 ctail(:0, _, _, _) : err(ctail, :0) . []

49 % get the righthand side of a clause (embedded comments not displaced)
50 ctailaux('%', :0, :1, :2) : comment('%', :0, ' '.' '.' '.:5) . ! .
51         rdchsk(:3) . ctailaux(:3, :5, :1, :2) . []
52 %% 0 termrepr, 1 rest of termrepr, 2 sym_tab, 3 rest_firstch,
53 %% 5 middletermrepr
54 ctailaux(:0, :1, :2, :3) : fterm(:0, :4, :1, ' '.'.'.:5, :3) .
55         fterms(:4, :5, :2, :3) . []
56 %% 0 fterm_firstch, 1 termrepr, 2 rest of termrepr,
57 %% 3 sym_tab, 4 fterms_firstch, 5 middletermrepr
58 % a list of functor-terms (ie calls)
59 fterms('.', ' '.'['.']'.:0, :0, _) : ! . []
60 %% 0 rest_of_termrepr
61 % eoln and blanks - cf ctail/2/
62 fterms(',', :4.' '.' '.' '.:0, :1, :2) : ! . iseoln(:4) .
63         rdchsk(:3) . ctailaux(:3, :0, :1, :2) . []
64 %% 0 termrepr, 1 rest_of_termrepr, 2 sym_tab, 3 ctail_firstch,
65 %% 4 eoln
66 fterms(:0, _, _, _) : err(fterms, :0) . []

67 .. 

68 % a functor-term
69 fterm(:0, :1, :2, :3, :4) :
70         ident(:0, :5, :2, :6) . ! . args(:5, :1, :6, :3, :4) . []
71 %% 0 id_firstch, 1 terminator, 2 termrepr, 3 rest of termrepr,
72 %% 4 sym_tab, 5 id_terminator, 6 middletermrepr
73 % identifiers: words, !, quoted names, symbols
74 ident(:0, :1, :0.:2, :3) :
75         wordstart(:0) . rdch(:4) . alphanums(:4, :1, :2, :3) . []
76 %% 0 id_firstch, 1 terminator, 2 termrepr,
77 %% 3 rest_of_termrepr, 4 nextch
78 ident('!', :0, '!'.:1, :1) : rch . skpb(:0) . []
79 %% 0 terminator, 1 termrepr

80 fctentp, :Q, :1, :2) : rdch(:3). qident(3, :0, :1, :2) , [J 

81 %% 0 terminator, 1 termrepr, 2 rest of termrepr, 3 nextch
82 ident(:0, :1, :0.:2, :3) :
83         symch(:0) . rdch(:4) . symbol(:4, :1, :2, :3) . []
84 %% 0 symb_firstch, 1 terminator, 2 termrepr,
85 %% 3 rest_of_termrepr, 4 nextch

86 % quoted identifiers 

87 qHentr". :0, :1 P 2): 

88         rdch(:3) . qidentail(:3, :0, :1, :2) . ! . []
89 %% 0 terminator, 1 termrepr, 2 rest_of_termrepr, 3 nextch
90 qident(:0, :1, :0.:2, :3) : rdch(:4) . qident(:4, :1, :2, :3) . []
91 %% 0 char, 1 terminator, 2 termrepr,
92 %% 3 rest_of_termrepr, 4 nextch

93 qktentalir 1 , :0, "".™ :1 ( :2) : 

94         rdch(:3) . qident(:3, :0, :1, :2) . []
95 %% 0 terminator, 1 termrepr, 2 rest_of_termrepr, 3 nextch
96 qidentail(_, :0, :1, :1) : skpb(:0) . []
97 %% 0 terminator, 1 rest_of_termrepr

98 % words and symbols 

99 alphanums(:0, :1, :0.:2, :3) :
100         alphanum(:0) . ! . rdch(:4) . alphanums(:4, :1, :2, :3) . []
101 %% 0 an_alphanum, 1 terminator, 2 termrepr,
102 %% 3 rest_of_termrepr, 4 nextch
103 alphanums(_, :0, :1, :1) : skpb(:0) . []
104 %% 0 terminator, 1 rest_of_termrepr
105 symbol(:0, :1, :0.:2, :3) :
106         symch(:0) . ! . rdch(:4) . symbol(:4, :1, :2, :3) . []
107 %% 0 a_symbolchar, 1 terminator, 2 termrepr,
108 %% 3 rest_of_termrepr, 4 nextch
109 symbol(_, :0, :1, :1) : skpb(:0) . []
110 %% 0 terminator, 1 rest_of_termrepr

111 % get argument list: nothing or a sequence of terms in brackets
112 args('(', :0, '('.:1, :2, :3) :
113         ! . rdchsk(:4) . terms(:4, :1, :2, :3) . rdchsk(:0) . []
114 %% 0 nextch, 1 termrepr, 2 rest of termrepr,
115 %% 3 sym_tab, 4 terms_firstch
116 args(:0, :0, :1, :1, _) : []
117 %% 0 nextch, 1 rest_of_termrepr
118 % get a sequence of terms
119 terms(:0, :1, :2, :3) : term(:0, :4, :1, :5, inargs, :3) .
120         termstail(:4, :5, :2, :3) . []
121 %% 0 term_firstch, 1 termrepr, 2 rest_of_termrepr, 3 sym_tab,
122 %% 4 terminator, 5 middletermrepr
123 termstail(')', ')'.:0, :0, _) : ! . []
124 %% 0 rest_of_termrepr
125 termstail(',', ','.' '.:0, :1, :2) :
126         ! . rdchsk(:3) . terms(:3, :0, :1, :2) . []
127 %% 0 middletermrepr, 1 rest_of_termrepr, 2 sym_tab, 3 nextch
128 termstail(:0, _, _, _) : err(termstail, :0) . []

129 %.... 

130 % get a term (context used to force brackets around lists within lists)
131 term(:0, :1, :2, :3, :4, :5) : t(:0, :1, :2, :3, :4, :5) . ! . []
132 %% 0 firstch, 1 terminator, 2 termrepr,
133 %% 3 rest_of_termrepr, 4 context, 5 sym_tab
134 term(:0, _, _, _, _, _) : err(term, :0) . []
135 t(:0, :1, :2, :3, _, :4) : variable(:0, :1, :2, :3, :4) . []

136 t{:0, :1 t :2, :3, inargs, :4) : list(:0, :1, :2, :3, :4) . Q 

137 t(:0, :1, T.:2, :3, inlist, :4) : list(:D, :1, :2, :4).[] 

138 % a dirty patch for negative numbers 

139 tr-\ :0, :1, :2, :3) : 

140 rdch(:4) , numberorfterm(:4, :0, :1, :2, :3) . [] 

141 %% 0 terminator, 1 termrepr, 2 rest_of Jermrepr, 

142 %% 3 sym jab, 4 nextch 

143 t(:0, :1, :2, :3, _, _) : number(:0, :1, :2, :3) . []

144 t(:0, :1, :2, :3, _, :4) : fterm(:0, :1, :2, :3, :4) . []

145 %.- 

146 numberorfterm(:0, :1,2:3, J : 

147 digrt(:0) . T , number(:0 t :1, :2, :3) . [] 

148 %% 0 nextch, 1 terminator, 2 termrepr, 3 restof Jermrepr 

149 numberorfterm(:0, :1, :2, :3, :4) : 

150 symbol(:0, :5, :2, ”’\:6). args(:5, :1, :6, :3, :4) . [) 

151 %% 0 nextch, 1 terminator, 2 termrepr, 3 rest_ofJermrepr, 

152 %% 4 sym Jab, 5 symboljerminator, 6 middletermrepr 

153 %. ------- 

154 % get a variable
155 variable(:0, :1, :2, :3, :4) : varstart(:0) . alphanums(:0, :1, :5, []) .
156         findv(:5, :2, :3, :4) . ! . []
157 %% 0 firstch, 1 terminator, 2 termrepr,
158 %% 3 rest_of_termrepr, 4 sym_tab, 5 name
159 findv('_'.[], '_'.:0, :0, _) : [] % no search: an anonymous variable
160 %% 0 rest_of_termrepr
161 findv(:0, ':'.:1, :2, :3) : look(:0, 0, :4, :3) . setn(:4, :1, :2) . []
162 %% 0 name, 1 termrepr, 2 rest_of_termrepr, 3 sym_tab, 4 num
163 % look counts from 0 and finds the position of a name in the symtab
164 look(:0, :1, :1, :0.:2) : []
165 %% 0 name, 1 num, 2 symtabtail
166 look(:0, :2, :1, _.:3) : sum(:2, 1, :4) . look(:0, :4, :1, :3) . []
167 %% 0 name, 1 num, 2 currnum, 3 symtabtail, 4 currnumplus1
168 % set a number: no more than two digits (should be enough)
169 setn(:0, :1.:2, :2) : 'less'(:0, 10) .
170         ordchr(:3, '0') . sum(:3, :0, :4) . ordchr(:4, :1) . []
171 %% 0 num, 1 char, 2 rest of termrepr, 3 k, 4 kplusnum
172 setn(:0, :1, :2) : 'less'(:0, 100) . prod(10, :3, :4, :0) .
173         setn(:3, :1, :5) . setn(:4, :5, :2) . []
174 %% 0 num, 1 termrepr, 2 rest_of_termrepr,
175 %% 3 numby10, 4 nummod10, 5 middletermrepr
176 setn(:0, _, _) : err(setn, :0) . []

177 %. ------- . 

178 % get a list in square brackets 

179 listen 2, ^3) : rdchsk(:4) . endlist(:4, ;1, 2, :3) . 

180 rdchsk(:0) . Q 

181 %% 0 terminator, t termrepr, 2 rest_ofjermrepr, 

182 %% 3 symjab, 4 nextch 

183 endlist(T, t-T-:0 t :0, J : Q 

184 %% 0 restof Jermrepr 

185 endlist(:0 P :1, :2 ( :3) :* ' 

186 term(:0 p :4, :1, V.:5, inlist, :3) . Itail(:4, :5, 2, :3). Q 

187 %% 0 firstch, 1 termrepr, 2 rest_of_termrepr, 

188 %% 3 symjab, 4 nextch, 5 middletermrepr 

189 MKT, T T-:0. :0, J : T . Q 

190 %% 0 rest of termrepr 

191 ftaif(T, :0 P :1 t 2) :T , rdchsk(:3) . variable(:3,T,:0.:1 p :2){] 

192 %% 0 termrepr, 1 rest of termrepr, 2 symjab, 3 nextch 

193 Haile/, :0, :1, :2) : T . rdchsk(:3) . 

194 term(:3, :4, :0, inlist, :2) . Itail(:4 p :5, :1, :2) . [] 

195 %% 0 termrepr, 1 rest_of Jermrepr, 2 sym tab, 

196 %% 3 term firstch, 4 nextch, 5 middletermrepr 

197 ltai!(:0, , _ f J : err(ltail, :0) . [] 

198 %----- 

199 % numbers: only natural ones
200 number(:0, :1, :2, :3) : digit(:0) . digits(:0, :1, :2, :3) . []
201 %% 0 firstch, 1 non_digit, 2 termrepr, 3 rest of termrepr
202 digits(:0, :1, :0.:2, :3) : digit(:0) .
203         ! . rdch(:4) . digits(:4, :1, :2, :3) . []
204 %% 0 firstch, 1 non_digit, 2 termrepr, 3 rest_of_termrepr,
205 %% 4 nextch
206 digits(_, :0, :1, :1) : skpb(:0) . []
207 %% 0 non_digit, 1 rest_of_termrepr

208 %. ---------- 
209 % auxiliary tests
210 wordstart(:0) : smalletter(:0) . []
211 varstart(:0) : bigletter(:0) . []
212 varstart('_') : []
213 %----
214 skpb(:0) : skipbl . lastch(:0) . []
215 %...
216 % output the translation
217 puttr([]) : ! . []
218 puttr(:0.:1) : wch(:0) . puttr(:1) . []
219 putvarnames(:0, _) : var(:0) . ! . nl . []
220 %% 0 symtabend
221 putvarnames(:0.:1, :2) : nextline(:2) . wch(' ') . display(:2) .
222         puttr(' '.:0) . wch(',') . sum(:2, 1, :3) . putvarnames(:1, :3) . []
223 %% 0 currname, 1 sym_tab_tail, 2 currnum, 3 nextnum
224 nextline(:0) : prod(6, _, 0, :0) . ! . nl . display(' %%') . []
225 %% 0 a_multiple_of_linesize
226 nextline(_) : []
227 % % % the end % % %

228 : display('"BOOTSTRAPPER" loaded.') . nl . see(user) . [] #
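The bootstrapper replaces each named variable by its position in an open-ended symbol table (look counts from 0) and then writes that position with at most two digits (setn). A minimal sketch of both steps in ordinary Prolog syntax follows. The names var_digits/2 and translate_var/3 are hypothetical, `is` arithmetic stands in for the kernel's sum, prod and ordchr, and the ':' prefix is an assumption based on the `:N` variables seen in the kernel file.

```prolog
% look(Name, N0, N, SymTab): N is the position of Name in the open list
% SymTab, counting from N0; an unseen name is appended at the open end.
look(Name, N, N, [Name|_]).
look(Name, N0, N, [_|Rest]) :-
    N1 is N0 + 1,
    look(Name, N1, N, Rest).

% var_digits(N, Codes): character codes of N, for 0 =< N < 100 only.
var_digits(N, [D]) :-
    N >= 0, N < 10,
    D is 0'0 + N.
var_digits(N, [T, U]) :-
    N >= 10, N < 100,
    T is 0'0 + N // 10,
    U is 0'0 + N mod 10.

% translate_var(Name, SymTab, Atom): Atom is the text :N for Name.
translate_var(Name, SymTab, Atom) :-
    once(look(Name, 0, N, SymTab)),     % first match or first free slot
    var_digits(N, Digits),
    char_code(':', Colon),
    atom_codes(Atom, [Colon|Digits]).
```

Starting from an unbound table T, `translate_var(x, T, A)` yields `':0'`, a following `translate_var(y, T, B)` yields `':1'`, and asking for x again returns `':0'`, because the table remembers earlier names.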
User Interface and Utilities


1 % Interpreter of Toy-Prolog - the Prolog part.
2 % (c) COPYRIGHT 1983 - Feliks Kluzniak, Stanislaw Szpakowicz
3 % Institute of Informatics, Warsaw University
4 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
5 % interactive driver - top level
6 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
7 ear :- nl, display('Toy-Prolog listening:'), nl, tag(loop).
8 ear :- halt('Toy-Prolog, end of session.').
9
10 loop :- repeat,
11     display('?- '), read(Term, Sym_tab), exec(Term, Sym_tab), fail.
12
13 stop :- tagfail(loop).
14
15 exec('e r r', _) :- !. % this covers variables, too
16 exec(:-(Goals), _) :- !, once(Goals).
17 exec(N, _) :- integer(N), !, num_clause.
18 exec(Goals, Sym_tab) :-
19     call(Goals), numbervars(Goals, 0, _),
20     printvars(Sym_tab), enough, !.
21 exec(_, _) :- display(no), nl. % if call(Goals) fails
22
23 enough :- rch, skipbl, lastch(Ch), rch, not(=(Ch, ';')).
24
25 printvars(Sym_tab) :- var(Sym_tab), display(yes), nl, !.
26 printvars(Sym_tab) :- prvars(Sym_tab).
27
28 prvars(Sym_tab) :- var(Sym_tab), !.
29 prvars([var(NameString, Instance) | Sym_tab_tail]) :-
30     writetext(NameString), display(' = '),
31     side_effects(outt(Instance, fd(_, _), q)),
32 % this is equivalent to writeq(Instance) but we avoid
33 % superfluous calls on numbervars - cf WRITE
34     nl, prvars(Sym_tab_tail).
35
36 num_clause :- display('+++ A number can''t be a clause.'), nl.
37
38 % read a program upto end. (the only way to define user procedures):
39 % consult/reconsult must be issued from the terminal, and it returns
40 % there ( consult(user) is correct, too )
41 consult(File) :- seeing(OldF), readprog(File), see(OldF).
42 reconsult(File) :-
43     redefine, seeing(OldF), readprog(File), see(OldF), redefine.
44 readprog(user) :- !, getprog.
45 readprog(File) :- see(File), echo, getprog, noecho, seen.
46
47 % the actual job is done by this procedure
48 getprog :- repeat, read(T), assimilate(T), =(T, end), !.
49
50 assimilate('e r r') :- !. % a variable is erroneous, too
51 assimilate(-->(Left, Right)) :-
52     !, tag(transl_rule(Left, Right, Clause)), assertz(Clause).
53 assimilate(:-(Goal)) :- !, once(Goal).
54 assimilate(end) :- !.
55 assimilate(N) :- integer(N), !, num_clause.
56 % otherwise - store the clause
57 assimilate(Clause) :- assertz(Clause).

58 

59 

60 

61 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
62 % reading a term
63 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
64 read(T) :- read(T, Sym_tab).
65 read(T, Sym_tab) :-
66     gettr(T_internal, Sym_tab), !, maketerm(T_internal, T).
67 % if gettr fails, then...
68 read('e r r', _) :-
69     nl, display('+++ Bad term on input. Text skipped: '), skip, nl.
70
71 % skip to the nearest full stop not in quotes or in comment
72 skip :- lastch(Ch), wch(Ch), skip(Ch).
73
74 skip(.) :- rch, lastch(Ch), e_skip(Ch), !.
75 skip('%') :- skip_comment, !, rch, skip.
76 skip(Q) :- isquote(Q), skip_s(Q), !, rch, skip.
77 skip(_) :- rch, skip.
78
79 % stop on a "layout" character
80 e_skip(Ch) :- @=<(Ch, ' ').
81 e_skip(Ch) :- wch(Ch), rch, skip.
82
83 skip_comment :- repeat, rch, lastch(Ch), wch(Ch), iseoln(Ch), !.
84
85 isquote(''''). isquote('"').
86
87 % skip a string
88 skip_s(Q) :- repeat, rch, lastch(Ch), wch(Ch), =(Ch, Q), !.

90 %
91 % parser
92 %
93 % This is an operator precedence parser for Prolog-10. gettr
94 % constructs the internal representation of a term. Next,
95 % maketerm constructs the term proper - see read. Here is an
96 % informal description of the underlying operator precedence grammar
97 % (each "rule" corresponds to one clause of reduce). Sides are
98 % separated by --> and multiple righthand sides by OR.
99 % t --> variable OR integer OR string
100 % t --> identifier
101 % t --> identifier ( t )
102 % t --> [] OR {}
103 % t --> ( t ) OR [ t ] OR { t }
104 % t --> [ t | t ]
105 % t --> t postfix_functor
106 % t --> t infix_functor t
107 % t --> prefix_functor t
108 % Sequences of terms separated by commas - in rules 3, 5, 6 - will be
109 % recognised as comma-terms (commas are infix functors, covered by
110 % rule 8). There are five types of operators, vns(_), id(_),
111 % ff(_, _, _), br(_, _), bar: see the scanner. The terminal symbol dot
112 % never gets onto the stack. The terminal symbol bottom is never
113 % returned by the scanner; it is only used to initiate and terminate the
114 % main loop (parse). The only nonterminal symbol is t(_).
115 % There are five types of internal representations (Args denotes the
116 % representation of arguments - usually a comma-term):
117 % tr(Name, Args) - for functor-terms,
118 % arg0(X) - for X a variable, atom, number, or string,
119 % bar(X, Y) - for a list with front X and tail Y,
120 % tr1(Name, X) - for prefix and postfix functors,
121 % tr2(Name, X, Y) - for infix functors.
122 % A Name in tr may be a bracket type. See reduce (clauses 5, 6)
123 % and maketerm for details.

124 

125 % - - - get the internal representation of a term
126 gettr(X, Sym_tab) :-
127     gettoken(T, Sym_tab), parse([bottom], T, X, Sym_tab).
128
129 % parse takes 4 parameters: the current stack, the current token
130 % from input, the variable used to bring the internal representation
131 % to the surface, and the symbol table (used by gettoken)
132 parse([t(X), bottom], dot, X, _) :- !.
133 parse(Stack, Input, X, Sym_tab) :-
134     topterminal(Stack, Top, Pos),
135     establish_precedence(Top, Input, Pos, Rel, RTop, RInput),
136     exch_top(Top, RTop, Stack, RStack),
137     step(Rel, RInput, RStack, NewStack, NewInput, Sym_tab),
138     parse(NewStack, NewInput, X, Sym_tab).
139
140 % the topmost terminal will be covered by at most one nonterminal
141 % (the third parameter gives Top's position: 1 on the top, 2 covered)
142 topterminal([t(_), Top | _], Top, 2) :- !.
143 topterminal([Top | _], Top, 1).
144
145 % change the topmost terminal (applies only to mixed functors)
146 exch_top(Top, Top, Stack, Stack) :- !.
147 exch_top(_, RTop, [t(X), _ | S], [t(X), RTop | S]) :- !.
148 exch_top(_, RTop, [_ | S], [RTop | S]).
149
150 % - - - perform one step: shift (stack the current token) or reduce
151 step(lseq, RInput, Stack, [RInput | Stack], NewInput, Sym_tab) :-
152     !, gettoken(NewInput, Sym_tab).
153 step(gt, RInput, Stack, NewStack, RInput, _) :-
154     reduce(Stack, NewStack), !.
155 % fail if reduction impossible (parse and gettr will fail, too -
156 % this failure will be intercepted by gettr's caller)
157
158 % reduce top segment of the stack according to the underlying grammar
159 reduce([ vns(X) | S ], [ t(arg0(X)) | S ]).
160 reduce([ id(I) | S ], [ t(arg0(I)) | S ]).
161 reduce([ br(r, '()'), t(X), br(l, '()'), id(I) | S ],
162     [ t(tr(I, X)) | S ]).
163 reduce([ br(r, Type), br(l, Type) | S ],
164     [ t(arg0(Type)) | S ]) :- not(=(Type, '()')).
165 % '[]' or '{}' - see p, 2nd clause
166 reduce([ br(r, Type), t(X), br(l, Type) | S ],
167     [ t(tr(Type, X)) | S ]).
168 reduce([ br(r, '[]'), t(Y), bar, t(X), br(l, '[]') | S ],
169     [ t(bar(X, Y)) | S ]).
170 reduce([ ff(I, Type, _), t(X) | S ],
171     [ t(tr1(I, X)) | S ]) :-
172     ismpostf(Type).
173 reduce([ t(Y), ff(I, Type, _), t(X) | S ],
174     [ t(tr2(I, X, Y)) | S ]) :-
175     isminf(Type).
176 reduce([ t(X), ff(I, Type, _) | S ],
177     [ t(tr1(I, X)) | S ]) :-
178     ismpref(Type).
179 % otherwise fail (cf step)
180
181 % - - - auxiliary tests for the parser
182 ispref(fy). ispref(fx).
183
184 ispostf(yf). ispostf(xf).
185
186 ismpref([TUn]) :- ispref(TUn).
187 ismpref([_, TUn]) :- ispref(TUn).
188
189 isminf([TBin]) :- member(TBin, [xfy, yfx, xfx]).
190 isminf([_, _]).
191
192 ismpostf([TUn]) :- ispostf(TUn).
193 ismpostf([_, TUn]) :- ispostf(TUn).

194 

195 % - - - establish precedence relation between the topmost
196 % terminal on the stack and the current input terminal
197 establish_precedence(Top, Input, Pos, Rel, RTop, RInput) :-
198     p(Top, Input, Pos, Rel0),
199     finalize(Rel0, Top, Input, Rel, RTop, RInput), !.
200
201 finalize(lseq, Top, Input, lseq, Top, Input).
202 finalize(gt, Top, Input, gt, Top, Input).
203 finalize(lseq(RTop, RInput), _, _, lseq, RTop, RInput).
204 finalize(gt(RTop, RInput), _, _, gt, RTop, RInput).
205
206 p(id(_), br(l, '()'), 1, lseq).
207 p(br(l, Type), br(r, Type), _, lseq).
208 p(br(l, '[]'), bar, 2, lseq).
209 p(bar, br(r, '[]'), 2, lseq).
210
211 p(Top, Input, 1, gt) :-
212     vns_id_br(Top, r), br_bar(Input, r).
213 p(Top, ff(N, Types, P), 1, gt(Top, ff(N, RTypes, P))) :-
214     vns_id_br(Top, r), restrict(Types, [fx, fy], RTypes).
215 p(Top, Input, 1, lseq) :-
216     br_bar(Top, l), vns_id_br(Input, l).
217 p(Top, ff(N, Types, P), Pos, lseq(Top, ff(N, RTypes, P))) :-
218     br_bar(Top, l), pre_inpost(Pos, Types, RTypes).
219 p(ff(N, Types, P), Input, Pos, gt(ff(N, RTypes, P), Input)) :-
220     br_bar(Input, r), post_inpre(Pos, Types, RTypes).
221 p(ff(N, Types, P), Input, 1, lseq(ff(N, RTypes, P), Input)) :-
222     vns_id_br(Input, l), restrict(Types, [xf, yf], RTypes).
223
224 % functors with equal priorities
225 p(ff(NTop, TsTop, P), ff(NInp, TsInp, P), Pos, Rel) :-
226     res_confl(TsTop, TsInp, Pos, RTsTop, RTsInp, Rel0),
227     !, do_rel(Rel0, ff(NTop, RTsTop, P), ff(NInp, RTsInp, P), Rel).
228 % different priorities
229 p(ff(NTop, TsTop, PTop), ff(NInp, TsInp, PInp), Pos,
230     gt(ff(NTop, RTsTop, PTop), ff(NInp, RTsInp, PInp))) :-
231     stronger(PTop, PInp), !,
232     restrict(TsInp, [fx, fy], RTsInp),
233     post_inpre(Pos, TsTop, RTsTop).
234 p(ff(NTop, TsTop, PTop), ff(NInp, TsInp, PInp), Pos,
235     lseq(ff(NTop, RTsTop, PTop), ff(NInp, RTsInp, PInp))) :-
236     stronger(PInp, PTop), !,
237     restrict(TsTop, [xf, yf], RTsTop),
238     pre_inpost(Pos, TsInp, RTsInp).
239
240 p(_, dot, _, gt).
241 p(bottom, _, _, lseq).
242 % otherwise fail (parse fails, too)
243
244 vns_id_br(vns(_), _).
245 vns_id_br(id(_), _).
246 vns_id_br(br(LeftRight, _), LeftRight).
247
248 br_bar(br(LeftRight, _), LeftRight).
249 br_bar(bar, _).
250
251 stronger(Prior1, Prior2) :- less(Prior1, Prior2).

252 

253 pre_inpost(1, Types, RTypes) :- % the functor must be prefix
254     restrict(Types, [xf, yf], A),
255     restrict(A, [xfy, yfx, xfx], RTypes).
256 pre_inpost(2, Types, RTypes) :- % the functor must not be prefix
257     restrict(Types, [fx, fy], RTypes).
258
259 post_inpre(1, Types, RTypes) :- % the functor must be postfix
260     restrict(Types, [fx, fy], A),
261     restrict(A, [xfy, yfx, xfx], RTypes).
262 post_inpre(2, Types, RTypes) :- % the functor must not be postfix
263     restrict(Types, [xf, yf], RTypes).
264
265 % leave only those types that do not belong to RSet,
266 % fail if this would leave no types at all (RSet
267 % contains only binary types, or only unary types)
268 restrict([T], RSet, [T]) :- !, not(member(T, RSet)).
269 restrict([TBin, TUn], RSet, [TBin]) :- member(TUn, RSet), !.
270 restrict([TBin, TUn], RSet, [TUn]) :- member(TBin, RSet), !.
271 restrict(Types, _, Types).
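The effect of restrict is easiest to see on a few concrete type lists. The sketch below restates the procedure in ordinary Prolog so that it can be tried on its own; in_set/2 is a hypothetical helper standing in for member, and `\+` replaces not.

```prolog
% in_set(X, Set): X occurs in Set (deterministic membership test).
in_set(X, [X|_]) :- !.
in_set(X, [_|Rest]) :- in_set(X, Rest).

% restrict(Types, RSet, RTypes): RTypes are the Types not listed in RSet.
% Types holds one type, or a binary and a unary type (a mixed functor).
% The call fails if nothing would be left.
restrict([T], RSet, [T]) :- !, \+ in_set(T, RSet).
restrict([TBin, TUn], RSet, [TBin]) :- in_set(TUn, RSet), !.
restrict([TBin, TUn], RSet, [TUn]) :- in_set(TBin, RSet), !.
restrict(Types, _, Types).
```

For a functor that is both infix and prefix, `restrict([xfy, fy], [fx, fy], R)` gives `R = [xfy]`: the prefix reading is ruled out and the infix one survives. `restrict([fy], [fx, fy], R)` fails, which signals that the functor cannot be used in this position.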

272 

273 % compute relation for two functors with equal priorities; four cases:
274 % both normal, Top mixed, Input mixed, both mixed
275 res_confl([TTop], [TInp], Pos, [TTop], [TInp], Rel0) :-
276     !, ff_p(TTop, TInp, Pos, Rel0).
277 res_confl([TTopBin, TTopUn], [TInp], Pos, RTsTop, [TInp], Rel0) :-
278     !, ff_p(TTopBin, TInp, Pos, RelB),
279     ff_p(TTopUn, TInp, Pos, RelU),
280     match_rels(RelB, RelU, Rel0, TTopBin, TTopUn, RTsTop).
281 res_confl([TTop], [TInpBin, TInpUn], Pos, [TTop], RTsInp, Rel0) :-
282     !, ff_p(TTop, TInpBin, Pos, RelB),
283     ff_p(TTop, TInpUn, Pos, RelU),
284     match_rels(RelB, RelU, Rel0, TInpBin, TInpUn, RTsInp).
285 res_confl([TTopBin, TTopUn], [TInpBin, TInpUn], Pos, RTsTop, RTsInp,
286     Rel0) :- ff_p(TTopBin, TInpBin, Pos, RelBB),
287     ff_p(TTopBin, TInpUn, Pos, RelBU),
288     ff_p(TTopUn, TInpBin, Pos, RelUB),
289     ff_p(TTopUn, TInpUn, Pos, RelUU),
290     res_mixed(RelBB, RelBU, RelUB, RelUU, Rel0,
291         TTopBin, TTopUn, TInpBin, TInpUn, RTsTop, RTsInp), !.
292
293 do_rel(lseq, TopF, InpF, lseq(TopF, InpF)).
294 do_rel(gt, TopF, InpF, gt(TopF, InpF)).
295 % fail if Rel0 = err
296
297 match_rels(Rel, Rel, Rel, TBin, TUn, [TBin, TUn]) :- !. % err included
298 match_rels(err, Rel, Rel, _, TUn, [TUn]) :- !.
299 match_rels(Rel, err, Rel, TBin, _, [TBin]) :- !.
300 match_rels(_, _, err, TBin, TUn, [TBin, TUn]).
301
302 res_mixed(Rel0, Rel0, Rel0, Rel0, Rel0,
303     TTopBin, TTopUn, TInpBin, TInpUn,
304     [TTopBin, TTopUn], [TInpBin, TInpUn]).
305 res_mixed(err, err, RelUB, RelUU, Rel0,
306     _, TTopUn, TInpBin, TInpUn, [TTopUn], RTsInp) :-
307     match_rels(RelUB, RelUU, Rel0, TInpBin, TInpUn, RTsInp).
308 res_mixed(RelBB, RelBU, err, err, Rel0,
309     TTopBin, _, TInpBin, TInpUn, [TTopBin], RTsInp) :-
310     match_rels(RelBB, RelBU, Rel0, TInpBin, TInpUn, RTsInp).
311 res_mixed(err, RelBU, err, RelUU, Rel0,
312     TTopBin, TTopUn, _, TInpUn, RTsTop, [TInpUn]) :-
313     match_rels(RelBU, RelUU, Rel0, TTopBin, TTopUn, RTsTop).
314 res_mixed(RelBB, err, RelUB, err, Rel0,
315     TTopBin, TTopUn, TInpBin, _, RTsTop, [TInpBin]) :-
316     match_rels(RelBB, RelUB, Rel0, TTopBin, TTopUn, RTsTop).
317 res_mixed(_, _, _, _, err, _, _, _, _, _, _).

318 

319 % establish precedence relation for two (basic) types
320 ff_p(TTop, TInp, Pos, lseq) :-
321     member(TTop, [xfy, fy]), % right_associative
322     ff_p_aux1(Pos, TInp), !.
323 ff_p(TTop, TInp, Pos, gt) :-
324     member(TInp, [yfx, yf]), % left associative
325     ff_p_aux2(Pos, TTop), !.
326 ff_p(_, _, _, err).
327
328 ff_p_aux1(1, TInp) :- ispref(TInp).
329 ff_p_aux1(2, TInp) :- member(TInp, [xfy, xf, xfx]).
330
331 ff_p_aux2(1, TTop) :- ispostf(TTop).
332 ff_p_aux2(2, TTop) :- member(TTop, [yfx, fx, xfx]).

333 

334 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
335 % internal representation --> term
336 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
337 maketerm(arg0(X), X) :- !. % variable, atom, number, string
338 maketerm(tr('()', RawTerm), T) :-
339     !, maketerm(RawTerm, T).
340 maketerm(bar(RawList, RawTail), T) :-
341     !, maketerm(RawTail, Tail),
342     makelist(RawList, Tail, T).
343 maketerm(tr('[]', RawList), T) :-
344     !, makelist(RawList, '[]', T).
345 maketerm(tr('{}', RawArg), '{}'(Arg)) :-
346     !, maketerm(RawArg, Arg).
347 maketerm(tr(Name, RawArgs), T) :-
348     !, makelist(RawArgs, '[]', Args),
349     =..(T, [Name | Args]).
350 maketerm(tr2(Name, RawArg1, RawArg2), T) :-
351     !, maketerm(RawArg1, Arg1), maketerm(RawArg2, Arg2),
352     =..(T, [Name, Arg1, Arg2]).
353 maketerm(tr1(Name, RawArg), T) :-
354     maketerm(RawArg, Arg), =..(T, [Name, Arg]).
355
356 % comma-term to dot-list-with-Tail
357 makelist(tr2(',', RawArg, RawArgs), Tail, [Arg | Args]) :-
358     !, maketerm(RawArg, Arg), makelist(RawArgs, Tail, Args).
359 makelist(RawArg, Tail, [Arg | Tail]) :- maketerm(RawArg, Arg).
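To see how the parser's internal representation turns into a term, the conversion can be tried on hand-built raw terms. The sketch below is a trimmed restatement in ordinary Prolog, with the hypothetical names raw_term/2 and raw_list/3 chosen so as not to clash with the listing. It omits the '()' and '{}' bracket cases and ends lists with `[]` where the listing uses the atom '[]'.

```prolog
% raw_term(Raw, Term): build Term from the parser's representation Raw.
raw_term(arg0(X), X) :- !.                  % variable, atom, number, string
raw_term(bar(RawList, RawTail), T) :- !,    % [ Front | Tail ]
    raw_term(RawTail, Tail),
    raw_list(RawList, Tail, T).
raw_term(tr('[]', RawList), T) :- !,        % [ Items ]
    raw_list(RawList, [], T).
raw_term(tr(Name, RawArgs), T) :- !,        % Name(Args)
    raw_list(RawArgs, [], Args),
    T =.. [Name|Args].
raw_term(tr2(Name, Raw1, Raw2), T) :- !,    % infix functor
    raw_term(Raw1, A1),
    raw_term(Raw2, A2),
    T =.. [Name, A1, A2].
raw_term(tr1(Name, Raw), T) :-              % prefix or postfix functor
    raw_term(Raw, A),
    T =.. [Name, A].

% raw_list(Raw, Tail, List): unfold a comma-term into a list ending in Tail.
raw_list(tr2(',', Raw, Raws), Tail, [A|As]) :- !,
    raw_term(Raw, A),
    raw_list(Raws, Tail, As).
raw_list(Raw, Tail, [A|Tail]) :-
    raw_term(Raw, A).
```

For example, the raw form of `f(a, b, c)` is `tr(f, tr2(',', arg0(a), tr2(',', arg0(b), arg0(c))))`. The arguments arrive as one right-nested comma-term, and raw_list flattens it before `=..` assembles the final term.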
361 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
362 % scanner
363 % ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
364 % this scanner returns six kinds of tokens:
365 %    vns(_)                   variables, numbers, strings
366 %    id(Name)                 atoms
367 %    ff(Name, Types, Prior)   "fix" functors
368 %    br(Which, Type)          brackets (left/right, '()'/'[]'/'{}')
369 %    bar                      | (in lists)
370 %    dot                      . followed by a layout character

371 

372 % - - - read a token and construct its internal form
373 % the input is supposed to be positioned
374 % over the first character of a token (or preceding "white space")
375 gettoken(Token, Sym_tab) :-
376     skipbl, lastch(Startch), absorbtoken(Startch, Rawtoken), !,
377     maketoken(Rawtoken, Token, Sym_tab), !.
378
379 % - - - read in a suitable sequence of characters
380 % a word, ie a regular alphanumeric identifier
381 absorbtoken(Ch, id([Ch | Wordtail])) :-
382     wordstart(Ch), getword(Wordtail).
383 % a variable
384 absorbtoken(Ch, var([Ch | Tail])) :-
385     varstart(Ch), getword(Tail).
386 % a solo character is a comma, a semicolon or an exclamation mark
387 absorbtoken(Ch, id([Ch])) :- solochar(Ch), rch.
388 % a bracket, ie ( ) [ ] { }
389 absorbtoken(Ch, br(Wh, Type)) :-
390     bracket(Ch), bracket(Ch, Wh, Type), rch.
391 absorbtoken('|', bar) :- rch.
392 % a string in quotes or in double quotes
393 absorbtoken('''', qid(Qname)) :-
394     rdch(Nextch), getstring('''', Nextch, Qname).
395 absorbtoken('"', str(String)) :-
396     rdch(Nextch), getstring('"', Nextch, String).
397 % a positive number
398 absorbtoken(Ch, num([Ch | Digits])) :-
399     digit(Ch), getdigits(Digits).
400 % a negative number or a dash (possibly starting a symbol, see below)
401 absorbtoken(-, Rawtoken) :- rdch(Ch), num_or_sym(Ch, Rawtoken).
402 absorbtoken(., Rawtoken) :- rdch(Ch), dot_or_sym(Ch, Rawtoken).
403 % a symbol, built of @ ~ - ‘
404 absorbtoken(Ch, id([Ch | Symbs])) :- symch(Ch), getsym(Symbs).
405 % an embedded comment
406 absorbtoken('%', Rawtoken) :-
407     skipcomment, lastch(Ch), absorbtoken(Ch, Rawtoken).
408 % this shouldn't happen:
409 absorbtoken(Ch, _) :- display(errinscan(Ch)), nl, fail.
410
411 num_or_sym(Ch, num([-, Ch | Digits])) :-
412     digit(Ch), getdigits(Digits).
413 num_or_sym(Ch, id([-, Ch | Symbs])) :- symch(Ch), getsym(Symbs).
414 num_or_sym(_, id([-])).
415
416 % layout characters precede ' ' in ASCII
417 dot_or_sym(Ch, dot) :- @=<(Ch, ' '). % no advance
418 dot_or_sym(Ch, id([., Ch | Symbs])) :- symch(Ch), getsym(Symbs).
419 dot_or_sym(_, id([.])).
420
421 skipcomment :- lastch(Ch), iseoln(Ch), skipbl, !.
422 skipcomment :- rch, skipcomment.
423
424 % - - - auxiliary input procedures
425 % read an alphanumeric identifier
426 getword([Ch | Word]) :-
427     rdch(Ch), alphanum(Ch), !, getword(Word).
428 getword([]).
429
430 % read a sequence of digits
431 getdigits([Ch | Digits]) :-
432     rdch(Ch), digit(Ch), !, getdigits(Digits).
433 getdigits([]).
434
435 % read a symbol
436 getsym([Ch | Symbs]) :-
437     rdch(Ch), symch(Ch), !, getsym(Symbs).
438 getsym([]).
439
440 % read a quoted id or string (Delim is either ' or ")
441 getstring(Delim, Delim, Str) :-
442     !, rdch(Nextch), twodelims(Delim, Nextch, Str).
443 getstring(Delim, Ch, [Ch | Str]) :-
444     rdch(Nextch), getstring(Delim, Nextch, Str).
445 twodelims(Delim, Delim, [Delim | Str]) :-
446     !, rdch(Nextch), getstring(Delim, Nextch, Str).
447 twodelims(_, _, []). % close the list
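The pair getstring/twodelims implements the usual convention that a delimiter written twice inside a quoted name stands for one delimiter character. The sketch below shows the same rule on a list of character codes rather than on the input stream. The name quoted_body/4 is hypothetical, and the code list is assumed to start just after the opening delimiter.

```prolog
% quoted_body(Delim, Codes, Str, Rest): Str is the text up to the closing
% Delim, with doubled delimiters collapsed; Rest follows the closing Delim.
quoted_body(D, [D, D|Cs], [D|Str], Rest) :- !,   % doubled delimiter
    quoted_body(D, Cs, Str, Rest).
quoted_body(D, [D|Rest], [], Rest) :- !.         % closing delimiter
quoted_body(D, [C|Cs], [C|Str], Rest) :-         % ordinary character
    quoted_body(D, Cs, Str, Rest).
```

With 39 as the code of the single quote and the input text `can''t' rest`, the call returns the codes of `can't` in Str and the codes of ` rest` in Rest. The call fails if the closing delimiter is missing.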

448 

449 % - - - auxiliary tests
450 wordstart(Ch) :- smalletter(Ch).
451 varstart(Ch) :- bigletter(Ch).
452 varstart('_').
453 bracket('(', l, '()'). bracket(')', r, '()').
454 bracket('[', l, '[]'). bracket(']', r, '[]').
455 bracket('{', l, '{}'). bracket('}', r, '{}').
456
457 % - - - transform a raw token into its final form
458 maketoken(var(Namestring), vns(Ptr), Sym_tab) :-
459     makeptr(Namestring, Ptr, Sym_tab).
460 maketoken(id(Namestring), Token, _) :-
461     pname(Name, Namestring), make_ff_or_id(Name, Token).
462 maketoken(qid(Namestring), id(Name), _) :-
463     pname(Name, Namestring).
464 maketoken(num([- | Digits]), vns(N), _) :-
465     pnamei(N1, Digits), sum(N, N1, 0).
466 maketoken(num(Digits), vns(N), _) :- pnamei(N, Digits).
467 maketoken(str(Chars), vns(Chars), _).
468 maketoken(Token, Token, _). % br(_, _) and bar and dot
469
470 % variables are kept in a symbol table (an open list)
471 makeptr(['_'], _, _). % no search - an anonymous variable
472 makeptr(Nmstr, Ptr, Sym_tab) :- look_var(var(Nmstr, Ptr), Sym_tab).
473
474 % look-up
475 look_var(Item, [Item | Sym_tab]).
476 look_var(Item, [_ | Sym_tab]) :- look_var(Item, Sym_tab).
477
478 make_ff_or_id(Name, ff(Name, Types, Prior)) :-
479     'FF'(Name, Types, Prior), !.
480 make_ff_or_id(Name, id(Name)).

%:::::::::::::::::::::::::::::::::::::::::::::::::
% grammar rule preprocessor
%:::::::::::::::::::::::::::::::::::::::::::::::::
transl_rule(Left, Right, Clause) :-
    two_ok(Left, Right),
    isolate_lhs_t(Left, Nont, Lhs_t),
    connect(Lhs_t, Outpar, Finalvar),
    expand(Nont, Initvar, Outpar, Head),
    makebody(Right, Initvar, Finalvar, Body, Alt_flag),
    do_clause(Body, Head, Clause).

do_clause(true, Head, Head) :- !.
do_clause(Body, Head, :-(Head, Body)).

% Lhs_t is a list (possibly empty) of lefthand side terminals
isolate_lhs_t(','(Nont, Lhs_t), Nont, Lhs_t) :-
    ';'(nonvarint(Nont), rulerror(varint)),
    ';'(isclosedlist(Lhs_t), rulerror(ter)), !.
isolate_lhs_t(Nont, Nont, []).

% fail if not a closed list
isclosedlist(L) :- check(iscll(L)).
iscll(L) :- var(L), !, fail.
iscll([]).
iscll([_ | L]) :- iscll(L).

% connect terminals to the nearest nonterminal's input parameter
% (actually, "open" a closed list)
connect([], Nextvar, Nextvar) :- !.
connect([Tsym | Tsyms], [Tsym | Outpar], Nextvar) :-
    connect(Tsyms, Outpar, Nextvar).

% - - - translate the righthand side (loop over alternatives)
% in alternatives, each righthand side is preceded by a dummy
% nonterminal, as defined by ' dummy' --> []. (since terminals
% are appended to input parameters, the input parameter of a common
% lefthand side must be a variable)
makebody(';'(Alt, Alts), Initvar, Finalvar,
         ';'(','(' dummy'(Initvar, Nextvar), Alt_b), Alt_bs), _) :-
    !, two_ok(Alt, Alts),
    makeright(Alt, Nextvar, Finalvar, Alt_b),
    makebody(Alts, Initvar, Finalvar, Alt_bs, alt).
makebody(Right, Initvar, Finalvar, Body, Alt_flag) :-
    var(Alt_flag), !,   % only one alternative
    makeright(Right, Initvar, Finalvar, Body).
makebody(Right, Initvar, Finalvar,
         ','(' dummy'(Initvar, Nextvar), Body), alt) :-
    makeright(Right, Nextvar, Finalvar, Body).

% - - - translate one alternative
makeright(','(Item, Items), Thispar, Finalvar, T_item_items) :-
    !, two_ok(Item, Items),
    transl_item(Item, Thispar, Nextvar, T_item),
    makeright(Items, Nextvar, Finalvar, T_items),
    combine(T_item, T_items, T_item_items).
makeright(Item, Thispar, Finalvar, T_item) :-
    transl_item(Item, Thispar, Finalvar, T_item).

combine(true, T_items, T_items) :- !.
combine(T_item, true, T_item) :- !.
combine(T_item, T_items, ','(T_item, T_items)).

% - - - translate one item (sure to be a functor-term)
transl_item(Terminals, Thispar, Nextvar, true) :-
    isclosedlist(Terminals),
    !, connect(Terminals, Thispar, Nextvar).
% conditions (the cut and others)
transl_item(!, Thispar, Thispar, !) :- !.
transl_item('{}'(Cond), Thispar, Thispar, call(Cond)) :- !.
% bad list of terminals (missed the first clause)
transl_item([_ | _], _, _, _) :- rulerror(ter).
% a nested alternative
transl_item(';'(X, Y), Thispar, Nextvar, Transl) :-
    !, makebody(';'(X, Y), Thispar, Nextvar, Transl, _).
% finally, a regular nonterminal
transl_item(Nont, Thispar, Nextvar, Transl) :-
    expand(Nont, Thispar, Nextvar, Transl).

% add input parameter and output parameter
expand(Nont, In_par, Out_par, Call) :-
    =..(Nont, [Fun | Args]),
    =..(Call, [Fun, In_par, Out_par | Args]).

% - - - error handling
two_ok(X, Y) :- nonvarint(X), nonvarint(Y), !.
two_ok(_, _) :- rulerror(varint).

rulerror(Message) :-
    nl, display('+++ Error in this rule: '), mes(Message), nl,
    tagfail(transl_rule(_, _, _)).
% diagnostics are only very brief (and not too informative ...)
mes(varint) :- display('Variable or integer item.').
mes(ter) :- display('Terminals not on a closed list.').

% - - - initiate grammar processing
phrase(Nont, Terminals) :-
    nonvarint(Nont), !,
    expand(Nont, Terminals, [], Init_call),
    call(Init_call).
phrase(N, T) :- error(phrase(N, T)).

' dummy'(X, X).
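
A small illustration of what the preprocessor aims to produce may help here. The sketch below uses standard Edinburgh-style Prolog rather than Toy-Prolog, and the names `greeting_t`, `name_t` and `phrase_t` are hypothetical; the clauses are a hand translation written for this example, not actual output of `transl_rule`.

```prolog
% Hypothetical grammar being translated:
%     greeting --> [hello], name.
%     name     --> [world].
% Each nonterminal gains an input and an output parameter (cf. expand);
% terminals are "connected" to the front of the input parameter
% (cf. connect), so they end up in the head of the clause.
greeting_t([hello | Mid], Out) :- name_t(Mid, Out).
name_t([world | Out], Out).

% Analogue of phrase: the terminal list must be consumed completely.
phrase_t(Terminals) :- greeting_t(Terminals, []).
```

Under these assumptions the goal `phrase_t([hello, world])` succeeds, while `phrase_t([hello])` fails because `name_t` finds no terminal to consume.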

% *********************************
% library
% *********************************
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% =.. (read as "univ")
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
=..(X, Y) :- var(X), var(Y), !, error(=..(X, Y)).
=..(Num, [Num]) :- integer(Num), !.
=..(Term, [Fun | Args]) :-
    setarity(Term, Args, N),
    functor(Term, Fun, N),       % this works both ways
    not(integer(Fun)),           % we don't want eg 17(X)
    setargs(Term, Args, 0, N).   % this works both ways, too

setarity(Term, Args, N) :- var(Term), !, length(Args, N).
% notice that bad Args give an error in length
setarity(_, _, _).   % Arity will be set by functor in =..

% both numeric parameters are given,
% the loop stops when the third reaches the fourth
% (works both ways because arg does)
setargs(_, [], N, N) :- !.
setargs(Term, [Arg | Args], K, N) :-
    sum(K, 1, K1), arg(K1, Term, Arg),
    setargs(Term, Args, K1, N).

% find the length of a closed list; error if not closed
length(List, N) :- length(List, 0, N).

% this is a tail-recursive formulation of length
length(L, _, _) :- var(L), !, error(length(L, _)).
length([], N, N) :- !.
length([_ | List], K, N) :-
    !, sum(K, 1, K1), length(List, K1, N).
length(Bizarre, _, _) :- error(length(Bizarre, _)).

% bind every variable to a distinct 'V'(N)
numbervars('V'(N), N, NextN) :- !, sum(N, 1, NextN).
numbervars('V'(_), N, N) :- !.
numbervars(X, N, N) :- integer(X), !.
numbervars(X, N, NextN) :- numbervars(X, 1, N, NextN).

numbervars(X, K, N, NextN) :-
    arg(K, X, A), !, numbervars(A, N, MidN),
    sum(K, 1, K1), numbervars(X, K1, MidN, NextN).
numbervars(_, _, N, N).


%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% predefined "fix" functors and op
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% (ordered according to probable frequency)
'FF'(',', [xfy], 1000).
'FF'(:-, [xfx, fx], 1200).
'FF'(';', [xfy], 1100).
'FF'(not, [fy], 900).
'FF'(=, [xfx], 700).
'FF'(is, [xfx], 700).
'FF'(-->, [xfx], 1200).
'FF'(+, [yfx, fx], 500).     'FF'(-, [yfx, fx], 500).
'FF'(*, [yfx], 400).         'FF'(/, [yfx], 400).
'FF'(mod, [xfx], 300).
'FF'(<, [xfx], 700).         'FF'(=<, [xfx], 700).
'FF'(>, [xfx], 700).         'FF'(>=, [xfx], 700).
'FF'(=:=, [xfx], 700).       'FF'(=\=, [xfx], 700).
'FF'(@<, [xfx], 700).        'FF'(@=<, [xfx], 700).
'FF'(@>, [xfx], 700).        'FF'(@>=, [xfx], 700).
'FF'(=.., [xfx], 700).
'FF'(==, [xfx], 700).        'FF'(\==, [xfx], 700).


% this implementation of op takes care of redefinitions
% and of mixed functors
op(Prior, Type, Name) :-
    atom(Name), pname(Name, String), noq(String),
    % noq - see WRITE
    integer(Prior), less(0, Prior), less(Prior, 1201),
    set_kind(Type, Kind), !,
    do_op(Prior, Type, Name, Kind).
% if not all parameters are OK -
op(P, T, N) :- error(op(P, T, N)).

% set Kind to bin or un
set_kind(Type, bin) :- binary(Type, _), !.
set_kind(Type, un) :- unary(Type, _, _), !.

% test for binary and instantiate Assoc
binary(xfy, a(r)).    % right associative
binary(yfx, a(l)).    % left associative
binary(xfx, na(_)).   % non-associative

% test for unary, instantiate Kind and Assoc
unary(fy, pre, a(r)).     % right associative
unary(fx, pre, na(r)).    % right non-associative
unary(yf, post, a(l)).    % left associative
unary(xf, post, na(l)).   % left non-associative

do_op(P, T, N, Kind) :-
    'FF'(N, Oldtypes, Oldprior), !,
    addff(Oldtypes, Oldprior, P, T, N, Kind).
do_op(P, T, N, _) :- assertz('FF'(N, [T], P)).

% add or redefine a functor
% for mixed functors, keep the binary type before the unary

% the same priority: redefine or make mixed
addff([Oldtype], P, P, T, N, Kind) :-
    !, set_kind(Oldtype, Oldkind),
    addff1(Oldkind, Kind, Oldtype, T, N, P).
addff([Oldtype1, Oldtype2], P, P, T, N, Kind) :-
    !, addff2(Kind, Oldtype1, Oldtype2, T, P, N).
% otherwise the priorities were different: redefine
addff(_, _, P, T, N, _) :- redeff(N, [T], P).


% make a mixed functor or change type
addff1(bin, un, Oldtype, T, N, P) :- mk_mixed(N, [Oldtype, T], P).
addff1(un, bin, Oldtype, T, N, P) :- mk_mixed(N, [T, Oldtype], P).
addff1(Kind, Kind, _, T, N, P) :- redeff(N, [T], P).

% adjust a mixed functor by changing one of its types
addff2(bin, _, Oldtype2, T, P, N) :- mk_mixed(N, [T, Oldtype2], P).
addff2(un, Oldtype1, _, T, P, N) :- mk_mixed(N, [Oldtype1, T], P).

mk_mixed(N, Types, P) :-
    retract('FF'(N, _, _)), !, assertz('FF'(N, Types, P)).

% redefine and issue a warning
redeff(N, T, P) :-
    nl, display('Functor '), display(N),
    display(' redefined'), nl,
    retract('FF'(N, _, _)), !, asserta('FF'(N, T, P)).

% remove a declaration
delop(Name) :- atom(Name), retract('FF'(Name, _, _)), !.
delop(Name) :- error(delop(Name)).

%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% evaluate an arithmetic expression
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
is(N, N) :- integer(N), !.
is(Val, +(A, B)) :-
    !, is(Av, A), is(Bv, B), sum(Av, Bv, Val).
is(Val, -(A, B)) :-
    !, is(Av, A), is(Bv, B), sum(Bv, Val, Av).
is(Val, *(A, B)) :-
    !, is(Av, A), is(Bv, B), prod(Av, Bv, 0, Val).
is(Val, /(A, B)) :-
    !, is(Av, A), is(Bv, B), prod(Bv, Val, _, Av).
is(Val, mod(A, B)) :-
    !, is(Av, A), is(Bv, B), prod(Bv, _, Val, Av).
is(Val, +(A)) :- !, is(Val, A).
is(Val, -(A)) :- !, is(Av, A), sum(Val, Av, 0).
is(N, [N]) :- integer(N).
% otherwise fail

% - - - evaluate an arithmetic relation
=:=(X, Y) :- is(XV, X), is(XV, Y).
<(X, Y) :- is(XV, X), is(YV, Y), less(XV, YV).
=<(X, Y) :- is(XV, X), is(YV, Y), not(less(YV, XV)).
>(X, Y) :- is(XV, X), is(YV, Y), less(YV, XV).
>=(X, Y) :- is(XV, X), is(YV, Y), not(less(XV, YV)).
=\=(X, Y) :- not(=:=(X, Y)).

%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% perfect equality of terms
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
==(T1, T2) :- var(T1), var(T2), !, eqvar(T1, T2).
==(T1, T2) :- check(==?(T1, T2)).

\==(T1, T2) :- not(==?(T1, T2)).

==?(T1, T2) :-
    integer(T1), integer(T2), !, =(T1, T2).
==?(T1, T2) :-
    nonvarint(T1), nonvarint(T2),
    functor(T1, Fun, Arity), functor(T2, Fun, Arity),
    equalargs(T1, T2, 1).

equalargs(T1, T2, Argnumber) :-
    arg(Argnumber, T1, Arg1), arg(Argnumber, T2, Arg2),
    % arg fails given too large a number
    !, ==(Arg1, Arg2), sum(Argnumber, 1, Nextnumber),
    equalargs(T1, T2, Nextnumber).
equalargs(_, _, _).

%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% assert, asserta, assertz, retract, clause
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% - - - add a clause (using built-in assert(_, _, _))
assert(Cl) :- asserta(Cl).
asserta(Cl) :-
    nonvarint(Cl), convert(Cl, Head, Body), !,
    assert(Head, Body, 0).
asserta(Cl) :- error(asserta(Cl)).

assertz(Cl) :-
    nonvarint(Cl), convert(Cl, Head, Body), !,
    assert(Head, Body, 32767).   % ie 2 to 15th minus 1
assertz(Cl) :- error(assertz(Cl)).

% convert the external form of a Body into a dotted list
convert(:-(Head, B), Head, Body) :- conv_body(B, Body).
convert(Unit_cl, Unit_cl, []).

% this procedure works both ways
conv_body(B, [call(B)]) :- var(B), !.
conv_body(true, []).
conv_body(B, Body) :- conv_b(B, Body).

conv_b(B, [Body]) :- var(B), !, conv_call(B, Body).
conv_b(','(C, B), [Call | Body]) :-
    !, conv_call(C, Call), conv_b(B, Body).
conv_b(Call, [Call]).   % not a variable

% interpreter can process variable calls only within call
conv_call(C, call(C)) :- var(C), !.
conv_call(C, C).

% - - - remove a clause (this procedure is backtrackable)
retract(Cl) :-
    nonvarint(Cl), convert(Cl, Head, Body), !,
    functor(Head, Fun, Arity), remcls(Fun, Arity, 1, Head, Body).
retract(Cl) :- error(retract(Cl)).

% ultimate failure if N too big (retract/3 fails)
remcls(Fun, Arity, N, Head, Body) :-
    clause(Fun, Arity, N, N_head, N_body),
    remcls(Fun, Arity, N, N_head, Head, N_body, Body).

remcls(Fun, Arity, N, Head, Head, Body, Body) :-
    retract(Fun, Arity, N).
% user's backtracking resumes retract here
% (after removing the Nth clause the next becomes Nth)
remcls(Fun, Arity, N, N_head, Head, N_body, Body) :-
    check(=(N_head, Head)), check(=(N_body, Body)),
    !, remcls(Fun, Arity, N, Head, Body).
remcls(Fun, Arity, N, Head, Body) :-
    sum(N, 1, N1), remcls(Fun, Arity, N1, Head, Body).

% - - - generate nondeterministically all clauses whose head
% and body match the parameters of clause
clause(Head, Body) :-
    nonvarint(Head), !, functor(Head, Fun, Arity),
    gencls(Fun, Arity, 1, Head, Body).
clause(Head, Body) :- error(clause(Head, Body)).

% generate; ultimate failure if N too big (clause/5 fails)
gencls(Fun, Arity, N, Head, Body) :-
    clause(Fun, Arity, N, N_head, N_body),
    gencls(Fun, Arity, N, N_head, Head, N_body, Body).

% fail if N_head does not match Head,
% or if N_body converted does not match Body
gencls(_, _, _, N_head, N_head, N_body, Body) :-
    conv_body(Body, N_body).
% user's backtracking resumes clause here
gencls(Fun, Arity, N, Head, Body) :-
    sum(N, 1, N1), gencls(Fun, Arity, N1, Head, Body).

%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% listing
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% list procedures determined by the parameter (listing(_))
% or all user's procedures (listing)
listing :-
    proc(Head), listproc(Head), nl, fail.
listing.   % catch the final fail from proc

listing(Fun) :- atom(Fun), listbyname(Fun).
listing('/'(Fun, Arity)) :-
    atom(Fun), integer(Arity), =<(0, Arity), !,
    functor(Head, Fun, Arity), listproc(Head).
listing(L) :-
    isclosedlist(L), listseveral(L), !.
listing(X) :- error(listing(X)).
% isclosedlist - cf grammar rule preprocessor

listseveral([]).
listseveral([Item | Items]) :-
    listing(Item), listseveral(Items).

% all procedures with this name
listbyname(Fun) :-
    proc(Head), functor(Head, Fun, _),
    listproc(Head), nl, fail.
listbyname(_).   % succeed

% one procedure
listproc(Head) :-
    clause(Head, Body),
    writeclause(Head, Body), wch('.'), nl, fail.
listproc(_).   % succeed

writeclause(Head, Body) :-
    not(var(Body)), =(Body, true), !, writeq(Head).
writeclause(Head, Body) :- writeq(:-(Head, Body)).

%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
% write
%:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
write(Term) :- side_effects(outterm(Term, noq)).

% writeq encloses in quotes all identifiers except words,
% symbols and solochars (not coinciding with "fix" functors)
writeq(Term) :- side_effects(outterm(Term, q)).

writetext([Ch | Chs]) :- !, wch(Ch), writetext(Chs).
writetext([]).

outterm(T, Q) :- numbervars(T, 1, _), outt(T, fd(_, _), Q).

% the real job is done here
outt('V'(N), _, _) :- integer(N), !, wch('X'), display(N).
% CAUTION: outt is unable to write 'V'(Integer)
outt(Term, _, _) :- integer(Term), display(Term), !.
% the second parameter specifies a context for "fix" functors:
% the nearest external functor and Term's position
% (to the left or to the right of the external functor)
outt(Term, Context, Q) :-
    =..(Term, [Name | Args]),
    outfun(Name, Args, Context, Q).

% - - - output a functor-term
% - as a "fix" term
outfun(Name, Args, Context, Q) :-
    isfix(Name, Args, This_ff, Kind), !,
    outff(Kind, This_ff, [Name | Args], Context, Q).
% - as a list
outfun('.', [Larg, Rarg], _, Q) :-
    !, outlist([Larg | Rarg], Q).
% - as a normal functor-term
outfun(Name, Args, _, Q) :-
    outname(Name, Q), outargs(Args, Q).

% isfix constructs a pair ff(Prior, Associativity), and
% 'in' or 'pre' or 'post' (fails if not a "fix" functor)
isfix(Name, [_, _], ff(Prior, Assoc), in) :-
    'FF'(Name, Types, Prior), mk_bin(Types, Assoc).
isfix(Name, [_], ff(Prior, Assoc), Kind) :-
    'FF'(Name, Types, Prior), mk_un(Types, Kind, Assoc).

% Bintype (if any) is before Untype (if any)
mk_bin([Bintype | _], Assoc) :- binary(Bintype, Assoc).
mk_un([Untype], Kind, Assoc) :- unary(Untype, Kind, Assoc).
mk_un([_, Untype], Kind, Assoc) :- unary(Untype, Kind, Assoc).
% tests - see op

% - - - output a "fix" term (this outff has 5 parameters)
outff(Kind, This_ff, NameArgs, Context, Q) :-
    agree(This_ff, Context), !,
    outff(Kind, This_ff, NameArgs, Q).
outff(Kind, This_ff, NameArgs, _, Q) :-
    wch('('), outff(Kind, This_ff, NameArgs, Q), wch(')').

% agree helps avoid (some) unnecessary brackets around the term
agree(_, fd(Ext_ff, _)) :- var(Ext_ff).
agree(ff(Prior1, _), fd(ff(Prior2, _), _)) :-
    stronger(Prior1, Prior2).   % cf the parser
agree(ff(Prior, a(Dir)), fd(ff(Prior, a(Dir)), Dir)).

% output the functor and the arguments (this outff has 4 parameters)
outff(in, This_ff, [Name, Larg, Rarg], Q) :-
    outt(Larg, fd(This_ff, l), Q),
    outfn(Name, ' '), outt(Rarg, fd(This_ff, r), Q).
outff(pre, This_ff, [Name, Arg], Q) :-
    outfn(Name, ' '), outt(Arg, fd(This_ff, r), Q).
outff(post, This_ff, [Name, Arg], Q) :-
    outt(Arg, fd(This_ff, l), Q), outfn(Name, ' ').

% output functor's name enclosed in Encl
outfn(Name, Encl) :- wch(Encl), display(Name), wch(Encl).

% - - - print a name (in quotes, if necessary)
outname(Name, noq) :- !, display(Name).
outname(Name, q) :-
    'FF'(Name, _, _), !, outfn(Name, '''').
outname(Name, q) :-
    pname(Name, Namestring),
    check(noq(Namestring)), !, display(Name).
outname(Name, q) :- outfn(Name, '''').

noq([Ch | String]) :- wordstart(Ch), isword(String).
noq([Ch]) :- solochar(Ch).
noq(['[', ']']).
noq([Ch | String]) :- symch(Ch), issym(String).

isword([]).
isword([Ch | String]) :- alphanum(Ch), isword(String).
issym([]).
issym([Ch | String]) :- symch(Ch), issym(String).

% - - - output a list of arguments (of outfun)
outargs([], _) :- !.
outargs(Args, Q) :-
    fake(Context), wch('('), outargs(Args, Context, Q), wch(')').

outargs([Last], Context, Q) :- !, outt(Last, Context, Q).
outargs([Arg | Args], Context, Q) :-
    outt(Arg, Context, Q), display(', '), outargs(Args, Context, Q).

% commas are used to delimit list items, so we must bracket commas
% within items (it's a trick: we depend on ',' having
% the priority 1000 and being associative)
fake(fd(ff(1000, na(_)), _)).

% - - - output a list in square brackets (cf outfun - the main
% functor is the dot, and the list cannot be empty)
outlist([First | Tail], Q) :-
    fake(Context), wch('['), outt(First, Context, Q),
    outlist(Tail, Context, Q), wch(']').

outlist([], _, _) :- !.
outlist([Item | Items], Context, Q) :-
    !, display(', '), outt(Item, Context, Q),
    outlist(Items, Context, Q).
% the bar and the closing item (still bracketed if it contains commas)
outlist(Closing, Context, Q) :-
    display(' | '), outt(Closing, Context, Q).

% *********************************
% translator
% *********************************
% read a program up to end. and translate it into "kernel" form
translate(Infile, Outfile) :-
    see(Infile), tell(Outfile),
    nl, repeat,
    read(Clause), put(Clause), nl, =(Clause, end), !,
    seen, told, see(user), tell(user).

% - - - produce and output the translation of one clause
put(:-(Head, Body)) :-
    !, puthead(Head, Sym_tab), putbody(Body, Sym_tab).
put(-->(Left, Right)) :-
    !, tag(transl_rule(Left, Right, :-(Head, Body))),
    puthead(Head, Sym_tab), putbody(Body, Sym_tab).
put(:-(Goal)) :-
    !, putbody(Goal, Sym_tab), wch(#), nl,
    once(Goal).   % a failure here wouldn't matter (cf translate)
put(end) :- !.
put('e r r') :- !.
put(Unitclause) :- puthead(Unitclause, Sym_tab), putbody(true, _).

% - - - put a head call (it must be a functor-term)
puthead(Head, Sym_tab) :-
    nonvarint(Head), !, putterm(Head, Sym_tab).
puthead(Head, _) :- transl_err(Head).

% - - - put a list of calls and [] at the end
putbody(Body, Sym_tab) :-
    punct(':'), conv_body(Body, B), !, putbody_c(B, Sym_tab).
% see assert etc for conv_body

putbody_c([], _) :- !, display([]).
putbody_c([Term | Terms], Sym_tab) :-
    not(integer(Term)), !, putterm(Term, Sym_tab),
    punct('.'), putbody_c(Terms, Sym_tab).
putbody_c([Term | _], _) :- transl_err(Term).

punct(Ch) :- wch(' '), wch(Ch), nl, display('   ').

% - - - put a term (with infix dots, and canonical otherwise)
putterm(Term, Sym_tab) :-
    var(Term), !, lookup(Term, Sym_tab, -1, N),
    wch(':'), display(N).
putterm(Term, _) :- integer(Term), !, display(Term).
putterm([Head | Tail], Sym_tab) :-
    !, putterm_inlist(Head, Sym_tab),
    display(' . '), putterm(Tail, Sym_tab).
putterm(Term, Sym_tab) :-
    =..(Term, [Name | Args]), outfn(Name, ''''),   % cf WRITE
    putargs(Args, Sym_tab).

% Sym_tab is an open list of pairs vn(Variable, Number)
% (this formulation helps avoid too many additions)
lookup(V, S_t_end, PreviousN, N) :-
    var(S_t_end), !, sum(PreviousN, 1, N),
    =(S_t_end, [vn(V, N) | New_s_t_end]).
lookup(V, [vn(CurrV, CurrN) | _], _, CurrN) :-
    eqvar(V, CurrV), !.
lookup(V, [vn(_, CurrN) | S_t_tail], _, N) :-
    lookup(V, S_t_tail, CurrN, N).

% arguments - nothing, or a list of terms in parentheses
putargs([], _) :- !.
putargs(Args, Sym_tab) :-
    wch('('), putarglist(Args, Sym_tab), wch(')').

putarglist([Arg], Sym_tab) :- !, putterm(Arg, Sym_tab).
putarglist([Arg | Args], Sym_tab) :-
    putterm(Arg, Sym_tab), display(', '),
    putarglist(Args, Sym_tab).

% - - - a list within a list must be enclosed in parentheses
putterm_inlist(Term, Sym_tab) :-
    nonvarint(Term), =(Term, [_ | _]), !,
    wch('('), putterm(Term, Sym_tab), wch(')').
putterm_inlist(Term, Sym_tab) :- putterm(Term, Sym_tab).

% - - - error handling (only one error is discovered by translate)
transl_err(X) :-
    nl, display('+++ Bad head or call: '), display(X), nl, fail.

1089 see( user}, ear. 
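
The translator numbers variables through an open list, i.e. a list whose tail is an unbound variable that is extended on demand. The following is a minimal sketch of that technique in standard Prolog; the name `lookup_sketch` is hypothetical, `==` stands in for Toy-Prolog's `eqvar`, and `is` replaces `sum`. It is meant only to illustrate the idea behind `lookup`, not to reproduce it.

```prolog
% lookup_sketch(+Key, ?OpenList, -N): find Key among the vn(Key, Number)
% pairs of an open list, or append it with the next free number.
% Numbering starts at 0, hence the initial "previous" number -1.
lookup_sketch(Key, Table, N) :- lookup_sketch(Key, Table, -1, N).

lookup_sketch(Key, Tail, Prev, N) :-
    var(Tail), !,                  % reached the open end: insert here
    N is Prev + 1,
    Tail = [vn(Key, N) | _].       % the new tail stays unbound
lookup_sketch(Key, [vn(K, N) | _], _, N) :-
    Key == K, !.                   % identical key (also for variables)
lookup_sketch(Key, [vn(_, Curr) | Rest], _, N) :-
    lookup_sketch(Key, Rest, Curr, N).
```

Carrying the previous number along means that an insertion needs a single addition, which seems to be the point of the remark that the formulation "helps avoid too many additions".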
Three Useful Programs


A simple editor 


% A simple interactive clause editor.
% Watch for name conflicts with its procedures!
% Note that this version has no safeguards against Prolog's crash
% (e.g. due to stack overflow).

% Call edit( name/arity ) to edit the procedure of this name and arity.

% Each invocation of edit is associated with a cursor, which is the number
% of a clause. Initially the cursor is at clause 0, i.e. before the first
% clause in this procedure. The cursor's value and its associated clause
% are usually displayed between commands.

% Commands are listed below. Terminate the line immediately after typing
% the last character. Don't use blanks where not shown and only one where shown.


%                 Commands
%
% e Name/Arity - invoke a nested instance to edit another procedure.
%                The current cursor stays in place unless you happen
%                to modify this procedure within a nested instance.
% x            - exit from the current editor instance.
% +            - move the cursor to the next clause, no action if none.
% <cr>         - an empty line is an alternative form of +.
% -            - move the cursor to the previous clause, no action if at 0.
% t            - top : move the cursor to 0.
% b            - bottom : move the cursor to the bottom clause
%                (0 for empty procedures).
% l            - list the whole procedure.
% d            - delete the current clause and move the cursor to
%                the next (or to the new bottom if bottom is deleted).
% i            - insert after the current clause. In the following
%                lines write clauses as you would after consult(user)
%                (terminate the sequence with end.) The cursor is
%                positioned at the last inserted clause.
% f Filename   - like i, but read the clauses from a file.
%                Take care ! Filename correctness is not checked.
% p            - invoke a nested instance of Prolog. If there is
%                no memory overflow, invoking stop will return
%                control to the editor.
%


edit( Name/Arity ) :- not ( atom( Name ), integer( Arity )), !,
        write( 'Bad parameters : ' ),
        write( edit( Name/Arity )), nl, fail.
edit( Name/Arity ) :- predefined( Name, Arity ), !,
        write( 'Can''t edit system routine : ' ),
        write( Name/Arity ), nl, fail.
edit( NameArity ) :- tag( ed( NameArity, 0 )).

ed( NameArity, Cursor ) :- show( NameArity, Cursor ), !,
        docmd( NameArity, Cursor, NewCursor ),
        ed( NameArity, NewCursor ).
ed( NameArity, Cursor ) :- display( 'Cursor out of range : ' ),
        display( Cursor ), nl, ed( NameArity, 0 ).

docmd( NameArity, Cursor, NewCursor ) :-
        repeat,   % repeat over incorrect commands
        getline( Line ), cmd( Line, NameArity, Cursor, NewCursor ), !.

getline( [] ) :- rch, lastch( C ), iseoln( C ), !.
getline( [ C | L ] ) :- lastch( C ), getline( L ).

% cmd fails for incorrect commands.

cmd( [], NmAr, Cur, NCur ) :- next_cursor( NmAr, Cur, NCur ).
cmd( ['+'], NmAr, Cur, NCur ) :- next_cursor( NmAr, Cur, NCur ).
cmd( ['-'], _, Cur, NCur ) :- prev_cursor( Cur, NCur ).
cmd( [t], _, _, 0 ).
cmd( [b], NmAr, Cur, NCur ) :- bottom_cursor( NmAr, Cur, NCur ).
cmd( [l], NmAr, Cur, Cur ) :- listing( NmAr ).
cmd( [d], NmAr, Cur, NCur ) :- delete( NmAr, Cur, NCur ).
cmd( [i], NmAr, Cur, NCur ) :- insert( NmAr, Cur, NCur ).
cmd( [f, ' ' | NameString], NmAr, Cur, NCur ) :-
        file_insert( NameString, NmAr, Cur, NCur ).
cmd( [e, ' ' | Args], NmAr, Cur, Cur ) :-
        append( NameString, ['/' | ArityString], Args ),
        call_edit( NameString, ArityString ).
cmd( [x], _, _, _ ) :- tagexit( ed( _, _ )).
cmd( [p], NmAr, Cur, Cur ) :- invoke_Prolog.
cmd( String, _, _, _ ) :- display( '--incorrect command : ' ),
        writetext( String ), nl, fail.

% check is provided with the standard library ( check(C) :- not not C )
next_cursor( Name/Arity, Cursor, Next ) :-
        Next is Cursor + 1, check( clause( Name, Arity, Next, _, _ )), !.
next_cursor( _, Cursor, Cursor ).   % cursor at last clause

prev_cursor( 0, 0 ).
prev_cursor( Cursor, Prev ) :- Cursor > 0, Prev is Cursor - 1.

bottom_cursor( Name/Arity, Cursor, Bottom ) :-
        Next is Cursor + 1, check( clause( Name, Arity, Next, _, _ )),
        !, bottom_cursor( Name/Arity, Next, Bottom ).
bottom_cursor( _, Cursor, Cursor ).

delete( _, 0, 0 ) :- !, display( 'Can''t delete clause 0' ), nl.
delete( Name/Arity, Cursor, NewCursor ) :-
        retract( Name, Arity, Cursor ),
        cursor_in_range( Name, Arity, Cursor, NewCursor ).

cursor_in_range( Nm, Ar, Cur, Cur ) :-
        check( clause( Nm, Ar, Cur, _, _ )), !.
cursor_in_range( _, _, Cur, Prev ) :- Prev is Cur - 1.


% convert is defined in the standard library
insert( NameArity, Cursor, NewCursor ) :-
        repeat,   % get end, or a clause of Name/Arity, skip others
        read( Clause ), convert( Clause, Head, Body ),
        accept( Head, NameArity, Clause ),
        !,
        end_or_proceed( Head, Body, NameArity, Cursor, NewCursor ).

end_or_proceed( end, [], _, Cursor, Cursor ) :- !.
end_or_proceed( Head, Body, NameArity, Cursor, NewCursor ) :-
        Next is Cursor + 1, assert( Head, Body, Cursor ),
        insert( NameArity, Next, NewCursor ).

accept( _, _, end ).
accept( Head, Name/Arity, _ ) :- functor( Head, Name, Arity ).
accept( _, _, Clause ) :-
        display( '--clause not in edited procedure - ignored' ),
        nl, write( Clause ), fail.


file_insert( FNameString, NameArity, Cursor, NewCursor ) :-
        pname( FileName, FNameString ),
        see( FileName ), insert( NameArity, Cursor, NewCursor ),
        seen, see( user ).

call_edit( NameString, ArityString ) :-
        pname( Name, NameString ), pnamei( Arity, ArityString ),
        edit( Name/Arity ).

invoke_Prolog :- tag( loop ).   % this works only for the Toy-Prolog monitor
invoke_Prolog.                  % ( loop terminated by tagfail )


% conv_body is defined in the standard library (asserta etc.),
% so is writeclause.

show( NameArity, 0 ) :- !, write( '[0] (' ), write( NameArity ),
        write( ')' ), nl.
show( Name/Arity, Cursor ) :-
        side_effects( ( clause( Name, Arity, Cursor, Head, Body ),
                        conv_body( NiceBody, Body ),
                        display( '[' ), display( Cursor ),
                        display( '] ' ),
                        writeclause( Head, NiceBody ),
                        display( '.' ), nl ) ).

append( [], L, L ).
append( [ E | L ], L2, [ E | LL2 ] ) :- append( L, L2, LL2 ).
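
The `e Name/Arity` command relies on running `append` "backwards" to split the typed characters at the slash. A minimal sketch of that idea in standard Prolog follows; the names `app_sketch`, `split_name_arity` and `name_arity` are hypothetical, and the ISO built-ins `atom_chars/2` and `number_chars/2` stand in for Toy-Prolog's `pname` and `pnamei`.

```prolog
% app_sketch/3: the usual list concatenation, defined locally so that
% the example does not depend on any library.
app_sketch([], L, L).
app_sketch([E | L], L2, [E | LL2]) :- app_sketch(L, L2, LL2).

% split_name_arity(+Chars, -NameChars, -ArityChars):
% called with only the third argument of app_sketch known, the
% concatenation enumerates splits; the first one with '/' is kept.
split_name_arity(Chars, NameChars, ArityChars) :-
    app_sketch(NameChars, ['/' | ArityChars], Chars), !.

% name_arity(+Chars, -Name/Arity): convert both halves to constants.
name_arity(Chars, Name/Arity) :-
    split_name_arity(Chars, NameChars, ArityChars),
    atom_chars(Name, NameChars),
    number_chars(Arity, ArityChars).
```

For instance, `name_arity([f,o,o,'/','2'], X)` binds `X` to `foo/2`, while a command line without a slash simply fails, much as `cmd` fails for incorrect commands.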










APPENDIX A.4 (Continued)


A primitive tracing tool 


% A primitive tracing package.
% Watch for name conflicts with its procedures !
% Use spy( Pattern ) to trace calls matching Pattern,
%     nospy( Pattern ) to stop tracing.
% To trace, execute trace( Goal ) instead of Goal.
% Successful calls are displayed with a plus, failing calls with a minus.
% Note: tagcut, tagexit, tagfail and ancestor will not be executed properly.
% trace is slow : if you wish to have the insides of a correct and
%       costly procedure executed at normal speed, add
%       a predefined(_) assertion for its call.

spy( All ) :- var( All ), !, assert( spied( All )).
spy( Pattern ) :- spied( Pattern ), !.   % spied already
spy( Pattern ) :- assert( spied( Pattern )).

nospy( Pattern ) :- retract( spied( Pattern )), fail.
nospy( _ ).

trace( Goal ) :- tag( runbody( Goal )).

runbody( (A , B) ) :- !, runbody( A ), runbody( B ).
runbody( (A ; B) ) :- !, ( runbody( A ) ; runbody( B ) ).
runbody( call( Call )) :-
        var( Call ), !, showfailure( call( Call )), fail.
runbody( call( Call )) :- !, runbody( Call ).
runbody( tag( Call )) :- !, runbody( call( Call )).
runbody( Call ) :- predefined( Call ), !, runsystem( Call ).
runbody( Call ) :- tag( runuser( Call )).


runsystem( ! ) :- runcut.
runsystem( Forbidden ) :-
        isforbidden( Forbidden ), !, nl,
        display( 'FORBIDDEN CALL ' ), write( Forbidden ),
        display( ' FAILS !' ), nl, fail.
runsystem( Call ) :- not spied( Call ), !, Call.
runsystem( Call ) :- Call, !, showsuccess( Call ).
runsystem( Call ) :- showfailure( Call ), fail.

runuser( Call ) :- not spied( Call ), !,
        clause( Call, Body ), runbody( Body ).
runuser( Call ) :- clause( Call, Body ), showsuccess( Call ),
        runbody( Body ).
runuser( Call ) :- showfailure( Call ), fail.

runcut :- spied( ! ), simulatecut, showsuccess( ! ).
runcut :- simulatecut.

simulatecut :- tagcut( runuser( _ )).
simulatecut :- tagcut( runbody( _ )).   % cut in initial goal




showsuccess( Call ) :- display( ' + ' ), write( Call ), nl.

showfailure( Call ) :- display( ' - ' ), write( Call ), nl.

isforbidden( tagexit( _ ) ).
isforbidden( tagfail( _ ) ).
isforbidden( tagcut( _ ) ).
isforbidden( ancestor( _ ) ).

predefined( Call ) :-
        check( ( functor( Call, F, N ), predefined( F, N ) ) ).
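
The package above relies on tag and tagcut, which belong to the dialect described in this book. For readers who want to try the idea elsewhere, the following is a minimal sketch in ISO-style Prolog, not the authors' code. It assumes that traced procedures are declared dynamic so that clause/2 may inspect them, it does not simulate cuts, and it does not handle built-in calls. The names spy_on, spy_off, trace_goal, show and app are hypothetical and were chosen to avoid clashes with system procedures.

```prolog
% A simplified tracer sketch: no tag/tagcut, no cut simulation,
% user-defined (dynamic) procedures only.
:- dynamic spied/1, app/3.

spy_on( Pattern ) :- spied( Pattern ), !.         % spied already
spy_on( Pattern ) :- assertz( spied( Pattern ) ).

spy_off( Pattern ) :- retract( spied( Pattern ) ), fail.
spy_off( _ ).

trace_goal( true ) :- !.
trace_goal( ( A , B ) ) :- !, trace_goal( A ), trace_goal( B ).
trace_goal( ( A ; B ) ) :- !, ( trace_goal( A ) ; trace_goal( B ) ).
trace_goal( Call ) :- \+ spied( Call ), !,        % not spied: run silently
        clause( Call, Body ), trace_goal( Body ).
trace_goal( Call ) :- clause( Call, Body ), show( ' + ', Call ),
        trace_goal( Body ).
trace_goal( Call ) :- show( ' - ', Call ), fail.  % no (more) clauses match

show( Sign, Call ) :- write( Sign ), write( Call ), nl.

% hypothetical procedure to trace
app( [], L, L ).
app( [E | L], L2, [E | LL2] ) :- app( L, L2, LL2 ).
```

After spy_on( app(_, _, _) ), the goal trace_goal( app([a], [b], X) ) binds X to [a, b] and prints one plus line per matching clause head; on backtracking, exhausted calls are reported with a minus.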


A program structure analyser with analyser analysed


% Given a procedure name and arity, print its call tree.
% The main data structure is a queue of procedures whose tail contains
% calls which were not yet seen. Each element of the queue contains a list
% of calls (references to main queue elements) and a variable to hold its
% ordinal number in the listed tree.
% Queues are searched linearly : the algorithm is costly for large trees.
% CAUTION : don't attempt to list a trace of this program - cyclic structures
% are formed as a rule.


calltree( Name/Arity ) :- add( proc( Name, Arity, Ord, Calls ), Queue ),
        fill( Queue, Queue ),
        print_calls( [proc(Name,Arity,Ord,Calls)], 3, 1, _ ).

% add finds (inserts) an element in (to) an open list
add( El, [El | Tail] ) :- !.
add( El, [_ | Tail] ) :- add( El, Tail ).

% fill walks the queue and expands procedures, inserting their calls into
% the queue if not yet seen. Queue beginning is passed along to allow search.
fill( [], _ ) :- !.          % evidently reached the terminating variable
fill( [proc(Name,Arity,_,[])|QTail], Q ) :- predefined( Name / Arity ), !,
        fill( QTail, Q ).
fill( [proc(Name,Arity,_,undefined)|QTail], Q ) :-
        not clause( Name, Arity, 1, _, _ ), !, fill( QTail, Q ).
fill( [proc(Name,Arity,_,Calls)|QTail], Q ) :-
        add_calls( Name, Arity, 1, Calls, Q ), fill( QTail, Q ).

% system procedures and procedures defined in the monitor should not be shown
predefined( Name / Arity ) :- predefined( Name, Arity ).

% only the more commonly used procedures (but the list is easily extended)
predefined( 'not' / 1 ).     predefined( nl / 0 ).

predefined( read / 1 ).      predefined( '=' / 2 ).

predefined( op / 3 ).        predefined( 'is' / 2 ).
predefined( assert / 1 ).    predefined( assertz / 1 ).
predefined( retract / 1 ).   predefined( clause / 2 ).
predefined( write / 1 ).     predefined( writeq / 1 ).

% add_calls processes the clauses of a procedure, adding calls to its list
% of calls and to the queue (only finding in the queue if already there)
add_calls( Name, Arity, N, Calls, Q ) :-
        clause( Name, Arity, N, _, Body ), !,
        body_calls( Body, Calls, Q ), N1 is N + 1,
        add_calls( Name, Arity, N1, Calls, Q ).
add_calls( _, _, _, [], _ ) :- !.     % close the list if empty
                                       % ( only unit clauses )
add_calls( _, _, _, _, _ ).           % non-empty list left open

body_calls( [], _, _ ) :- !.
body_calls( [Call | BodyTail], Calls, Q ) :-
        functor( Call, Name, Arity ),
        add( proc(Name,Arity,Ord,Callees), Calls ),
        add( proc(Name,Arity,Ord,Callees), Q ),
        add_insides( Call, Calls, Q ),
        body_calls( BodyTail, Calls, Q ).

% add_insides unpacks metalogical calls: if their arguments are not variable
% or integer, they are added to the queues.
add_insides( Call, Q1, Q2 ) :- meta_call_1( Call, Arg ), !,
        add_inside( Arg, Q1, Q2 ).
add_insides( Call, Q1, Q2 ) :- meta_call_2( Call, Arg1, Arg2 ), !,
        add_inside( Arg1, Q1, Q2 ),
        add_inside( Arg2, Q1, Q2 ).
add_insides( _, _, _ ).

add_inside( V, _, _ ) :- ( var( V ) ; integer( V ) ), !.
add_inside( Call, Q1, Q2 ) :- functor( Call, Name, Arity ),
        add( proc(Name,Arity,Ord,Callees), Q1 ),
        add( proc(Name,Arity,Ord,Callees), Q2 ),
        add_insides( Call, Q1, Q2 ).


meta_call_1( call( Call ), Call ).
meta_call_1( tag( Call ), Call ).
meta_call_1( not Call, Call ).
meta_call_1( check( Call ), Call ).
meta_call_1( side_effects( Call ), Call ).
meta_call_1( once( Call ), Call ).

meta_call_2( ( A , B ), A, B ).
meta_call_2( ( A ; B ), A, B ).

% Print calls, starting at given tab setting and ordinal, returning next ordinal
% number. Third clause fails if ordinal numbers don't match, i.e. proc
% was already printed in another line.

print_calls( [], _, Ord, Ord ) :- !.  % this matches the terminating var
                                       % of a call list.
print_calls( [proc(Name,Arity,Ord,undefined)|Calls], Tab, Ord, NOrd ) :-
        !, start_undefined( Ord, Tab ),
        writeq( Name/Arity ), display( ' **undefined**' ), nl,
        TOrd is Ord + 1, print_calls( Calls, Tab, TOrd, NOrd ).
print_calls( [proc(Name,Arity,Ord,Callees)|Calls], Tab, Ord, NOrd ) :-
        !, start_line( Ord, Tab ), writeq( Name/Arity ), nl,
        InnerTab is Tab + 3, InnerOrd is Ord + 1,
        print_calls( Callees, InnerTab, InnerOrd, TOrd ),
        print_calls( Calls, Tab, TOrd, NOrd ).
print_calls( [proc(Name,Arity,AnotherOrd,_)|Calls], Tab, Ord, NOrd ) :-
        start_unnumbered_line( Tab ), writeq( Name/Arity ),
        repetition( Name, Arity, AnotherOrd ), nl,
        print_calls( Calls, Tab, Ord, NOrd ).

repetition( Name, Arity, _ ) :- predefined( Name, Arity ), !.
repetition( _, _, Ord ) :- display( ' (see ' ), display( Ord ),
        display( ')' ).


% Ord numbers are printed in 4 columns, right justified
start_line( Ord, Tab ) :- number_line( Ord ), !, tab( Tab, ' ' ).
number_line( N ) :- N < 10, display( '   ' ), display( N ).
number_line( N ) :- N < 100, display( '  ' ), display( N ).
number_line( N ) :- N < 1000, display( ' ' ), display( N ).
number_line( N ) :- display( N ).


start_unnumbered_line( Tab ) :- display( '    ' ), tab( Tab, ' ' ).

start_undefined( Ord, Tab ) :- number_line( Ord ), tab( Tab, '?' ).
tab( 0, _ ) :- !.
tab( N, Ch ) :- wch( Ch ), N1 is N - 1, tab( N1, Ch ).

% ------
% a sample call and results
:- calltree( calltree / 1 ).

   1   calltree / 1
   2      add / 2
   3         ! / 0
             add / 2 (see 2)
   4      fill / 2
             ! / 0
   5         predefined / 1
   6            predefined / 2
             fill / 2 (see 4)
   7         'not' / 1
   8         clause / 5
   9         add_calls / 5
                clause / 5
                ! / 0
  10            body_calls / 3
                   ! / 0
  11               functor / 3
                   add / 2 (see 2)
  12               add_insides / 3
  13                  meta_call_1 / 2
                      ! / 0
  14                  add_inside / 3
  15                     ';' / 2
  16                        call / 1
  17                     var / 1
  18                     integer / 1
                         ! / 0
                         functor / 3
                         add / 2 (see 2)
                         add_insides / 3 (see 12)
  19                  meta_call_2 / 3
                   body_calls / 3 (see 10)
  20            'is' / 2
                add_calls / 5 (see 9)
  21      print_calls / 4
             ! / 0
  22         start_undefined / 2
  23            number_line / 1
  24               '<' / 2
                      'is' / 2 (see 20)
  25                  less / 2
  26               display / 1
  27            tab / 2
                   ! / 0
  28               wch / 1
                   'is' / 2 (see 20)
                   tab / 2 (see 27)
  29         writeq / 1
             display / 1
  30         nl / 0
             'is' / 2 (see 20)
             print_calls / 4 (see 21)
  31         start_line / 2
                number_line / 1 (see 23)
                ! / 0
                tab / 2 (see 27)
  32         start_unnumbered_line / 1
                display / 1
                tab / 2 (see 27)
  33         repetition / 3
                predefined / 2
                ! / 0
                display / 1
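
The analyser depends on add/2 working on open lists, that is, lists whose tail is an unbound variable. The sketch below isolates that technique so it can be tried on its own. The two add/2 clauses follow the listing (with anonymous variables to avoid singleton warnings); close_list/1 and demo/1 are hypothetical helpers introduced only for this illustration.

```prolog
% add/2: find El in an open list, or insert it at the unbound tail.
add( El, [El | _] ) :- !.
add( El, [_ | Tail] ) :- add( El, Tail ).

% close_list/1 (hypothetical): bind the unbound tail to [].
close_list( L ) :- var( L ), !, L = [].
close_list( [_ | T] ) :- close_list( T ).

demo( Q ) :-
        add( proc(a, 1, _, _), Q ),      % inserted as the first element
        add( proc(b, 2, _, _), Q ),      % inserted at the open tail
        add( proc(a, 1, Ord, _), Q ),    % found, so no duplicate is added
        Ord = 1,                         % the binding is shared with Q
        close_list( Q ).
```

Calling demo( Q ) leaves Q as a two-element list whose first element is proc(a, 1, 1, _): the ordinal bound through the third add is visible in the queue because both terms share the same variable. This sharing is what lets the analyser number a procedure once and refer to it from every call list.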
PROLOG FOR PROGRAMMERS
From the reviews 

"It contains material I have not been able to find in any other book — 
material that is a significant part of the Prolog 'culture'. It offers an excellent
discussion of metamorphosis grammars and the best treatment of data
structures that I have encountered.... It is in many ways a wonderful
book."

Overbeek's Outlook, June 1986 

"A high competence practical guide for the Prolog programmer. ... it 
should be on everyone's desk." 

Zentralblatt für Mathematik, 1986

This is a self-contained handbook of advanced logic programming. It 
assumes the reader is not a novice to computer science, has perhaps read 
an introductory book on Prolog and would now like to do some non-trivial 
work using the language. 

A thorough discussion of Prolog treatment of data structures, an overview 
of Prolog’s logical foundations, a concise but comprehensive presentation 
of logic grammars, simple and advanced programming techniques, style 
and efficiency guidelines, a real exercise in Prolog implementation, and 
two non-trivial case studies make this an invaluable handbook for both 
professional and student. Experienced Prolog programmers may find the 
systematic exposition helpful, and the discussion of implementation issues 
instructive. 

One of the unique features of the book is its detailed coverage of
Metamorphosis (or Definite Clause) Grammars: programmers applying Prolog
to natural or formal language processing will definitely appreciate this.

A small but complete interpreter is provided on the disk in source form 
(Pascal and Prolog) so that readers can use it to run their programs or 
take it apart to study in detail to achieve an in-depth understanding of 
Prolog. 